Introduction.


Optimal stochastic control problems based on Gaussian white noise models have been studied by Wonham [1]-[3], Haussmann [4] and Ahmed [5], among many others. Rishel [6], [7] has studied the control problem for systems with jump Markov disturbances. In this paper we investigate the linear quadratic regulator control problem with state- and control-dependent Poisson noises in the state dynamics. As in Wonham's treatment of LQG problems [2], [3], the optimal control law is a combination of a linear feedback and a feed-forward control. The optimal cost and optimal control are computed by solving a class of Riccati-like equations. When the coefficient matrices in the system dynamics and the performance index are constant, we consider the infinite time problem with an average cost criterion. The optimal control exists and is again the combination of a constant linear feedback and a constant feed-forward control, obtained by solving an algebraic Riccati-like equation. In addition, the optimal control of the finite time problem converges to the time-invariant control of the infinite time problem quasi-uniformly, almost surely. Similar results follow if we use a discounted cost criterion.

The optimal stochastic scheduling of systems with Poisson noise disturbances is treated in [8]. Almost sure stochastic stability of linear systems with Poisson noise disturbances is treated in [9]. See also [10].

1.  Finite  Time  Control  Problems. 

In this section, we consider the stochastic control system

$$dx(t) = A(t)x(t)\,dt + B(t)u(t)\,dt + C(t)x(t)\,dN_1(t) + D(t)u(t)\,dN_2(t) + \xi(t)\,dN_3(t), \qquad x(0) = x_0 \in \mathbb{R}^n \setminus \{0\}, \tag{1.1}$$

where $A(t)$ and $C(t)$ are $n \times n$ matrices; $B(t)$ and $D(t)$ are $n \times m$ matrices; and $\xi(t)$ is an $n$-vector. For simplicity, we assume $A(t)$, $B(t)$, $C(t)$ and $D(t)$ are piecewise continuous on $[0,T]$. $N_i(t)$, $i = 1, 2, 3$, are independent Poisson processes with intensities $\lambda_i$, $i = 1, 2, 3$, respectively. The system is considered together with the performance index

$$J(u) = E\Big\{\int_0^T L(t, x(t), u(t))\,dt + l(T, x(T))\Big\}. \tag{1.2}$$

Let the admissible control set be

$$U_{ad} = \{u(t) = \phi(t, x(t)) \mid \phi : [0,T] \times \mathbb{R}^n \to \mathbb{R}^m \text{ is piecewise continuous in } t \text{ for each } x \in \mathbb{R}^n; \text{ locally Lipschitz continuous and of at most linear growth in } x \text{ for each } t \in [0,T]\}.$$

We want to minimize $J(u)$ with respect to all controls $u \in U_{ad}$.
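The dynamics (1.1) can be explored numerically. The following is a minimal simulation sketch and not part of the original development. It assumes constant coefficient matrices, a user-supplied feedback law, and a simple Euler scheme in which the increments of $N_1, N_2, N_3$ over a step of length `dt` are drawn as Poisson variables with means $\lambda_i\,dt$. The function name `simulate` and all parameter names are illustrative.

```python
import numpy as np

def simulate(A, B, C, D, xi, lam, feedback, x0, T, steps, rng):
    """Euler scheme for dx = (A x + B u) dt + C x dN1 + D u dN2 + xi dN3.

    Assumption: constant coefficients; lam = (lam1, lam2, lam3) are the
    Poisson intensities; feedback(t, x) returns the control vector u.
    """
    dt = T / steps
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for i in range(steps):
        u = feedback(i * dt, x)
        dN = rng.poisson(np.asarray(lam) * dt)  # increments of N1, N2, N3
        x = x + (A @ x + B @ u) * dt + (C @ x) * dN[0] + (D @ u) * dN[1] + xi * dN[2]
        path.append(x.copy())
    return np.array(path)

# Scalar illustration with a hypothetical linear feedback u = -x.
rng = np.random.default_rng(0)
one = np.array([[1.0]])
path = simulate(-one, one, 0.1 * one, 0.1 * one, np.array([0.2]),
                (1.0, 1.0, 2.0), lambda t, x: -x, [1.0], 5.0, 5000, rng)
```

With all intensities set to zero the scheme reduces to the ordinary Euler method for $\dot x = Ax + Bu$, which gives a simple sanity check.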

Lemma 1.1 (Sufficient condition for optimality). Suppose there exist a control $u^0 \in U_{ad}$ and a continuous function $V : [0,T] \times \mathbb{R}^n \to \mathbb{R}$ with continuous partial derivatives with respect to $t$ and $x$ such that the minimum in

$$V_t(t,x) + \min_u \{L_u V(t,x) + L(t,x,u)\} = 0 \tag{1.3}$$

is achieved at $u^0(t,x)$ for all $(t,x) \in [0,T] \times \mathbb{R}^n$, with

$$V(T,x) = l(T,x), \quad \forall\, x \in \mathbb{R}^n. \tag{1.4}$$

Here $L_u$ is the differential operator corresponding to the state process, defined by

$$L_u V(t,x) = V_x'(A(t)x + B(t)u) + \lambda_1[V(t, x + C(t)x) - V(t,x)] + \lambda_2[V(t, x + D(t)u) - V(t,x)] + \lambda_3[V(t, x + \xi(t)) - V(t,x)], \tag{1.5}$$

where $'$ denotes matrix transpose. Then

$$V(0,x_0) = J(u^0) = \inf_{u \in U_{ad}} J(u).$$

Proof. Let $x^0(s)$ and $x(s)$ be the trajectories corresponding to control laws $\phi^0$ and $\phi$, respectively, with initial state $x^0(t) = x = x(t)$. From the integration formula for point processes, we have

$$V(t,x) = E_{t,x}\Big\{-\int_t^T [V_s(s,x^0(s)) + L_{u^0}V(s,x^0(s))]\,ds + V(T,x^0(T))\Big\} = E_{t,x}\Big\{\int_t^T L(s,x^0(s),\phi^0(s,x^0(s)))\,ds + l(T,x^0(T))\Big\}.$$

But with arbitrary $u(t) = \phi(t,x(t))$, we have from (1.3)

$$V(t,x) = E_{t,x}\Big\{-\int_t^T [V_s(s,x(s)) + L_{u}V(s,x(s))]\,ds + V(T,x(T))\Big\} \le E_{t,x}\Big\{\int_t^T L(s,x(s),\phi(s,x(s)))\,ds + l(T,x(T))\Big\}.$$

Thus, $V(0,x_0) = J(u^0) \le J(u)$ for all $u \in U_{ad}$, which shows that $u^0$ is an optimal control and $V(0,x_0)$ is the optimal value.

QED 

Remark. Equation (1.3) together with (1.4) makes up the Hamilton-Jacobi-Bellman (HJB) dynamic programming equation for this problem.

Now, we consider our system as a regulator, i.e., we let

$$L(t,x,u) = x'Q(t)x + u'R(t)u, \qquad l(T,x) = x'P_T x \tag{1.6}$$

with symmetric piecewise continuous matrices

$$Q(t) \ge 0, \quad R(t) > 0, \quad t \in [0,T]$$

and

$$P_T \ge 0 \text{ constant.}$$

To solve the HJB equation, we use

$$V(t,x) = x'P(t)x + 2p(t)'x + q(t) \tag{1.7}$$

for some $n \times n$ symmetric, non-negative definite matrix $P(t)$, $n$-vector $p(t)$ and scalar function $q(t)$ satisfying the final conditions

$$P(T) = P_T, \quad p(T) = 0, \quad q(T) = 0. \tag{1.8}$$

Then

$$V_t(t,x) = x'\frac{d}{dt}P(t)\,x + 2\frac{d}{dt}p(t)'x + \frac{d}{dt}q(t), \qquad V_x(t,x) = 2P(t)x + 2p(t). \tag{1.9}$$

Substituting (1.7) and (1.9) into (1.5), and suppressing the argument $t$, we get

$$\begin{aligned} L_u V(t,x) &= 2[Px + p]'[Ax + Bu] \\ &\quad + \lambda_1\{[x + Cx]'P[x + Cx] + 2p'[x + Cx] + q - [x'Px + 2p'x + q]\} \\ &\quad + \lambda_2\{[x + Du]'P[x + Du] + 2p'[x + Du] + q - [x'Px + 2p'x + q]\} \\ &\quad + \lambda_3\{[x + \xi]'P[x + \xi] + 2p'[x + \xi] + q - [x'Px + 2p'x + q]\} \\ &= 2[Px + p]'[Ax + Bu] \\ &\quad + \lambda_1[x'C'Px + x'PCx + x'C'PCx + 2p'Cx] \\ &\quad + \lambda_2[u'D'Px + x'PDu + u'D'PDu + 2p'Du] \\ &\quad + \lambda_3[\xi'Px + x'P\xi + \xi'P\xi + 2p'\xi]. \end{aligned} \tag{1.10}$$

Let the Hamiltonian functional be

$$H(t,x,u) = L_u V(t,x) + L(t,x,u). \tag{1.11}$$

Then set

$$0 = \frac{\partial H}{\partial u} = 2\lambda_2[D(t)'P(t)x + D(t)'P(t)D(t)u + D(t)'p(t)] + 2B(t)'[P(t)x + p(t)] + 2R(t)u,$$

which implies

$$u(t) = -[R(t) + \lambda_2 D(t)'P(t)D(t)]^{-1}[B(t) + \lambda_2 D(t)]'[P(t)x + p(t)]. \tag{1.12}$$

In addition,

$$\frac{\partial^2 H}{\partial u^2} = 2[R(t) + \lambda_2 D(t)'P(t)D(t)] > 0, \quad \forall\, t \in [0,T].$$

Thus, $u(t)$, defined in (1.12), minimizes (1.11). Now, set

$$\begin{aligned} \bar A(t) &= A(t) + \lambda_1 C(t), \\ \bar B(t) &= B(t) + \lambda_2 D(t), \\ \bar R(t) &= R(t) + \lambda_2 D(t)'P(t)D(t), \\ K(t) &= \bar R(t)^{-1}\bar B(t)'P(t), \\ k(t) &= \bar R(t)^{-1}\bar B(t)'p(t). \end{aligned} \tag{1.13}$$

Substituting (1.10) and (1.12) into (1.3), and suppressing the argument $t$, we obtain

$$\begin{aligned} & x'\frac{dP}{dt}x + 2\frac{dp}{dt}'x + \frac{dq}{dt} + 2[Px + p]'[Ax - BKx - Bk] \\ &\quad + \lambda_1\{x'C'Px + x'PCx + x'C'PCx + 2p'Cx\} \\ &\quad + \lambda_2\{-[Kx + k]'D'Px - x'PD[Kx + k] + [Kx + k]'D'PD[Kx + k] - 2p'D[Kx + k]\} \\ &\quad + \lambda_3\{\xi'Px + x'P\xi + \xi'P\xi + 2p'\xi\} \\ &\quad + x'Qx + [Kx + k]'R[Kx + k] = 0. \end{aligned}$$

After simplifying, we obtain

$$\begin{aligned} & x'\Big\{\frac{dP}{dt} + [\bar A - \bar BK]'P + P[\bar A - \bar BK] + \lambda_1 C'PC + K'\bar RK + Q\Big\}x \\ &\quad + 2x'\Big\{\frac{dp}{dt} + [\bar A - \bar BK]'p + \lambda_3 P\xi\Big\} \\ &\quad + \Big\{\frac{dq}{dt} - k'\bar Rk + \lambda_3\xi'P\xi + 2\lambda_3 p'\xi\Big\} = 0. \end{aligned}$$

As we vary $x \in \mathbb{R}^n$, we obtain the following equations:

$$\frac{d}{dt}P(t) + [\bar A(t) - \bar B(t)K(t)]'P(t) + P(t)[\bar A(t) - \bar B(t)K(t)] + \lambda_1 C(t)'P(t)C(t) + Q(t) + K(t)'\bar R(t)K(t) = 0, \qquad P(T) = P_T, \tag{1.14}$$

$$\frac{d}{dt}p(t) + [\bar A(t) - \bar B(t)K(t)]'p(t) + \lambda_3 P(t)\xi(t) = 0, \qquad p(T) = 0, \tag{1.15}$$

$$\frac{d}{dt}q(t) - k(t)'\bar R(t)k(t) + 2\lambda_3\xi(t)'p(t) + \lambda_3\xi(t)'P(t)\xi(t) = 0, \qquad q(T) = 0. \tag{1.16}$$

Since $\bar R(t)$ and $K(t)$ only involve $P(t)$, (1.15) and (1.16) are easily solved once we solve (1.14). Equation (1.14) is well known to have positive solutions if $D(t) \equiv 0$. To treat our case, we use the methods of quasi-linearization and successive approximation as in Wonham [2] to show existence and uniqueness of the solution of (1.14). Note that we always have a minimum property

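$$[\bar A - \bar BK]'P + P[\bar A - \bar BK] + K'\bar RK = [\bar A - \bar B\tilde K]'P + P[\bar A - \bar B\tilde K] + \tilde K'\bar R\tilde K - [\tilde K - K]'\bar R[\tilde K - K] \tag{1.17}$$

for any $m \times n$ matrix $\tilde K(t)$, where $K = \bar R^{-1}\bar B'P$ as in (1.13) and the argument $t$ is suppressed. Let

$$\Psi(P(t), \bar R(t), K(t)) \triangleq [\bar A(t) - \bar B(t)K(t)]'P(t) + P(t)[\bar A(t) - \bar B(t)K(t)] + \lambda_1 C(t)'P(t)C(t) + K(t)'\bar R(t)K(t). \tag{1.18}$$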
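The minimum property (1.17) is a completion-of-squares identity, and it can be checked numerically. The sketch below is only an illustration under assumed data: the matrices are random, the dimensions `n`, `m` and intensities `lam1`, `lam2` are arbitrary choices, and `psi` is a hypothetical helper implementing (1.18) for constant matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2
lam1, lam2 = 0.7, 0.4                       # assumed intensities
A, C = rng.normal(size=(n, n)), rng.normal(size=(n, n))
B, D = rng.normal(size=(n, m)), rng.normal(size=(n, m))
M = rng.normal(size=(n, n))
P = M @ M.T                                 # symmetric, non-negative definite
R = np.eye(m)

A_bar = A + lam1 * C                        # notation of (1.13)
B_bar = B + lam2 * D
R_bar = R + lam2 * D.T @ P @ D
K = np.linalg.solve(R_bar, B_bar.T @ P)     # minimizing gain

def psi(P, R_bar, K):
    """Psi(P, R_bar, K) of (1.18) for constant matrices."""
    Acl = A_bar - B_bar @ K
    return Acl.T @ P + P @ Acl + lam1 * C.T @ P @ C + K.T @ R_bar @ K

K_tilde = rng.normal(size=(m, n))           # arbitrary competing gain
gap = psi(P, R_bar, K_tilde) - psi(P, R_bar, K)
excess = (K_tilde - K).T @ R_bar @ (K_tilde - K)
```

Here `gap` and `excess` agree up to rounding, and `excess` is non-negative definite, which is exactly the sense in which $K$ minimizes $\Psi$ over all gains.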

Then (1.14) becomes

$$\frac{d}{dt}P(t) = -\Psi(P(t), \bar R(t), K(t)) - Q(t), \qquad P(T) = P_T. \tag{1.19}$$

For each $K(t)$, let $\Phi(t,s;K)$ be the transition matrix of $\bar A(t) - \bar B(t)K(t)$. To show that (1.19) has a solution, we choose $K_1(t)$ arbitrary and let
$$P^1(t) \triangleq \Phi(T,t;K_1)'P_T\Phi(T,t;K_1), \quad t \in [0,T],$$

$$P^{k+1}(t) \triangleq \Phi(T,t;K_1)'P_T\Phi(T,t;K_1) + \int_t^T \Phi(s,t;K_1)'\big(\lambda_1 C(s)'P^k(s)C(s) + K_1(s)'\bar R^k(s)K_1(s) + Q(s)\big)\Phi(s,t;K_1)\,ds, \tag{1.20}$$

where

$$\bar R^k(t) \triangleq R(t) + \lambda_2 D(t)'P^k(t)D(t).$$

Since all the functions involved are bounded on $[0,T]$, the sequence in (1.20) is easily seen to have a limit

$$P_1(t) \triangleq \lim_{k \to \infty} P^k(t).$$

Since $P_T \ge 0$ and is symmetric, from (1.20) we know $P^k(t) \ge 0$ and is symmetric for all $k \ge 1$. Thus, $P_1(t) \ge 0$ and is symmetric. Now, let

$$K_2(t) \triangleq \bar R_1(t)^{-1}\bar B(t)'P_1(t)$$

and

$$\bar R_1(t) = \lim_{k \to \infty} \bar R^k(t) = R(t) + \lambda_2 D(t)'P_1(t)D(t) > 0.$$

Using $K_2(t)$, we can use an iteration similar to (1.20) to obtain $P_2$ and $\bar R_2$. Continuing this process, we have $P_k$, $\bar R_k$ and $K_k$ such that

$$\bar R_k(t) = R(t) + \lambda_2 D(t)'P_k(t)D(t), \qquad K_{k+1}(t) = \bar R_k(t)^{-1}\bar B(t)'P_k(t) \tag{1.21}$$

and

$$P_k(t) \ge 0, \ \forall\, t \in [0,T], \qquad \frac{d}{dt}P_k(t) + \Psi(P_k(t), \bar R_k(t), K_k(t)) + Q(t) = 0, \qquad P_k(T) = P_T. \tag{1.22}$$

From the minimum property (1.17), we have
$$\begin{aligned} & \frac{d}{dt}P_k(t) + \Psi(P_k(t), \bar R_k(t), K_{k+1}(t)) + Q(t) + [K_k(t) - K_{k+1}(t)]'\bar R_k(t)[K_k(t) - K_{k+1}(t)] \\ &\quad = \frac{d}{dt}P_k(t) + \Psi(P_k(t), \bar R_k(t), K_k(t)) + Q(t) \\ &\quad = \frac{d}{dt}P_{k+1}(t) + \Psi(P_{k+1}(t), \bar R_{k+1}(t), K_{k+1}(t)) + Q(t). \end{aligned}$$

Let $S(t) = P_k(t) - P_{k+1}(t)$. Then

$$\frac{d}{dt}S(t) + \Psi(S(t), \bar R_k(t) - \bar R_{k+1}(t), K_{k+1}(t)) + [K_k(t) - K_{k+1}(t)]'\bar R_k(t)[K_k(t) - K_{k+1}(t)] = 0, \qquad S(T) = 0. \tag{1.23}$$

Since the last term in (1.23) is symmetric and non-negative definite, we know that $S(t) \ge 0$ as in (1.20). Thus, $P_k(t) \ge P_{k+1}(t) \ge 0$. By the monotone convergence theorem,

$$P(t) \triangleq \lim_{k \to \infty} P_k(t)$$

exists and is symmetric so that $P_1(t) \ge P(t) \ge 0$. Again

$$\bar R(t) = \lim_{k \to \infty}[R(t) + \lambda_2 D(t)'P_k(t)D(t)] = R(t) + \lambda_2 D(t)'P(t)D(t) > 0, \quad \forall\, t \in [0,T].$$

Thus $\bar R(t)^{-1}$ exists and

$$K(t) \triangleq \lim_{k \to \infty} K_{k+1}(t) = \bar R(t)^{-1}\bar B(t)'P(t).$$

If we let $\Phi(t,s)$ be the transition matrix of $\bar A(t)$, then (1.22) implies

$$P_k(t) = \Phi(T,t)'P_T\Phi(T,t) + \int_t^T \Phi(s,t)'\big[\lambda_1 C(s)'P_k(s)C(s) + Q(s) - K_k(s)'\bar B(s)'P_k(s) - P_k(s)\bar B(s)K_k(s) + K_k(s)'\bar R_k(s)K_k(s)\big]\Phi(s,t)\,ds.$$


Since

$$\|P_k\| \le \|P_1\|, \qquad \|\bar R_k\| \le \|R\| + \lambda_2\|D\|^2\|P_1\|$$

and

$$\|K_k\| \le \|R^{-1}\|\,\|\bar B\|\,\|P_1\|,$$

we apply the Dominated Convergence Theorem to get

$$P(t) = \Phi(T,t)'P_T\Phi(T,t) + \int_t^T \Phi(s,t)'\big[\lambda_1 C(s)'P(s)C(s) + Q(s) - K(s)'\bar B(s)'P(s) - P(s)\bar B(s)K(s) + K(s)'\bar R(s)K(s)\big]\Phi(s,t)\,ds$$

after $k$ tends to $\infty$. Thus, $P(t)$ is a solution of (1.19). Since $\Psi(P, \bar R, K)$ is locally Lipschitz in $P$, $P(t)$ is the unique solution of (1.19). In addition, let $\tilde K(t)$ be arbitrary and $\tilde P(t)$ be the solution of (1.19) with $\tilde K(t)$ replacing $K(t)$ and $\tilde P(T) = P_T$. Then by the minimum property (1.17), we have

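$$\begin{aligned} 0 &= \frac{d}{dt}\tilde P(t) + \Psi(\tilde P(t), \bar R_{\tilde P}(t), \tilde K(t)) + Q(t) \\ &= \frac{d}{dt}P(t) + \Psi(P(t), \bar R(t), K(t)) + Q(t) \\ &= \frac{d}{dt}P(t) + \Psi(P(t), \bar R(t), \tilde K(t)) - [\tilde K(t) - K(t)]'\bar R(t)[\tilde K(t) - K(t)] + Q(t), \end{aligned}$$

where

$$\bar R_{\tilde P}(t) = R(t) + \lambda_2 D(t)'\tilde P(t)D(t).$$

Let $S(t) = \tilde P(t) - P(t)$. We obtain

$$\frac{d}{dt}S(t) + \Psi(S(t), \bar R_{\tilde P}(t) - \bar R(t), \tilde K(t)) + [\tilde K(t) - K(t)]'\bar R(t)[\tilde K(t) - K(t)] = 0, \qquad S(T) = 0. \tag{1.24}$$

Since the last term in (1.24) is non-negative definite, we have $S(t) \ge 0$, so that $\tilde P(t) \ge P(t)$. The solutions of (1.15) and (1.16), respectively, are thus easily shown to be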
$$p(t) = \lambda_3\int_t^T \Phi(s,t;K)'P(s)\xi(s)\,ds \tag{1.25}$$

and

$$q(t) = \int_t^T [2\lambda_3\xi(s)'p(s) + \lambda_3\xi(s)'P(s)\xi(s) - k(s)'\bar R(s)k(s)]\,ds \tag{1.26}$$

with

$$k(t) = \bar R(t)^{-1}\bar B(t)'p(t). \tag{1.27}$$

Consequently,  we  have  proved  the  following  theorem. 

Theorem 1.1. The stochastic linear quadratic regulator (1.1) with (1.2) and (1.6) has an optimal control

$$u(t) = -[K(t)x(t) + k(t)] \tag{1.28}$$

where

$$K(t) = [R(t) + \lambda_2 D(t)'P(t)D(t)]^{-1}[B(t) + \lambda_2 D(t)]'P(t), \qquad k(t) = [R(t) + \lambda_2 D(t)'P(t)D(t)]^{-1}[B(t) + \lambda_2 D(t)]'p(t). \tag{1.29}$$

The optimal value is $J(u) = V(0,x_0) = x_0'P(0)x_0 + 2p(0)'x_0 + q(0)$, with $P(t) \ge 0$, $p(t)$ and $q(t)$ being the unique solutions of (1.14), (1.15) and (1.16), respectively. Here $x(t)$ is the optimal trajectory of (1.1) corresponding to $u$ as input control.
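To make Theorem 1.1 concrete, the following sketch integrates (1.14)-(1.16) backward from $t = T$ in the scalar, constant-coefficient case and evaluates the optimal value at $x_0$. It is an illustration under stated assumptions, not the paper's procedure: the classical fourth-order Runge-Kutta scheme, the names `rhs` and `solve_backward`, the symbol `Qw` for the state weight $Q$, and all numerical values are choices made here.

```python
import numpy as np

def rhs(y, a, b, c, d, xi, lam, Qw, R):
    """Right-hand sides of (1.14)-(1.16) in reversed time s = T - t (scalar case)."""
    P, p, q = y
    a_bar, b_bar = a + lam[0] * c, b + lam[1] * d   # notation of (1.13)
    R_bar = R + lam[1] * d * d * P
    K, k = b_bar * P / R_bar, b_bar * p / R_bar     # feedback and feed-forward gains
    dP = 2 * (a_bar - b_bar * K) * P + lam[0] * c * c * P + Qw + K * K * R_bar
    dp = (a_bar - b_bar * K) * p + lam[2] * P * xi
    dq = -k * k * R_bar + lam[2] * xi * xi * P + 2 * lam[2] * p * xi
    return np.array([dP, dp, dq])

def solve_backward(T, steps, PT, **coef):
    """Return (P(0), p(0), q(0)) starting from P(T) = PT, p(T) = q(T) = 0."""
    h = T / steps
    y = np.array([PT, 0.0, 0.0])
    for _ in range(steps):
        k1 = rhs(y, **coef)
        k2 = rhs(y + h / 2 * k1, **coef)
        k3 = rhs(y + h / 2 * k2, **coef)
        k4 = rhs(y + h * k3, **coef)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

# Assumed example data.
coef = dict(a=0.5, b=1.0, c=0.1, d=0.1, xi=0.2, lam=(1.0, 1.0, 2.0), Qw=1.0, R=1.0)
P0, p0, q0 = solve_backward(5.0, 5000, 0.0, **coef)
x0 = 1.0
V0 = P0 * x0**2 + 2 * p0 * x0 + q0   # optimal value V(0, x0) of Theorem 1.1
```

When $C = D = 0$ and $\xi = 0$ the first equation reduces to the classical Riccati equation; for $a = 0$, $b = Q = R = 1$, $P_T = 0$ its solution is $P(t) = \tanh(T - t)$, which provides a check on the integrator.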

2.  Infinite  Time  Stochastic  Control. 

To investigate the infinite time problem (1.1), (1.2) as $T \to \infty$, we consider the case when the coefficient matrices in (1.1) and (1.6) are constant, i.e., $A(t) = A$, $B(t) = B$, $C(t) = C$, $D(t) = D$, $\xi(t) = \xi$, $Q(t) = Q$, $R(t) = R$, and let $P_T = 0$ for convenience. We would like to have a stationary feedback control, i.e., $P(t) = P$, $p(t) = p$. But $q(0) \to \infty$ as $T \to \infty$ in (1.26), and so the cost $V(0,x_0) \to \infty$ as $T \to \infty$. This is to be expected since the noise $dN_3$ acts continuously on the system, so we modify the performance index (1.2) to be the average cost

$$J_{av}(u) = \lim_{T \to \infty}\frac{1}{T}E\int_0^T [x(t)'Qx(t) + u(t)'Ru(t)]\,dt \tag{2.1}$$

or the discounted cost

$$J_d(u) = E\int_0^\infty e^{-2\alpha t}[x(t)'Qx(t) + u(t)'Ru(t)]\,dt, \quad \alpha > 0. \tag{2.2}$$

We will discuss both. See [11] for a general discussion of this class of problems.

Since we want $P_T(t) \to P$, a constant matrix, as $T \to \infty$ for each $t$, $P$ should be a solution of the algebraic equation

$$\Psi(P, \bar R, K) + Q = 0. \tag{2.3}$$

Before investigating solutions of (2.3), we need some preliminary lemmas which are adapted from Wonham [2].

Lemma 2.1. Let $G'G + H'H = F'F$ and let $M$ be an arbitrary matrix of compatible dimensions.

(i) If $(G,A)$ is observable, then $(F, A + MH)$ is observable.

(ii) If $(G,A)$ is detectable, then $(F, A + MH)$ is detectable.

Proof. (i) Denote

$$\langle A \mid B\rangle \triangleq \{B, AB, \ldots, A^{n-1}B\}.$$

If $(G,A)$ is observable, then the range $R(\langle A' \mid G'\rangle) = \mathbb{R}^n$. Since $x'F'Fx = 0$ implies $x'G'Gx = 0$ and $x'H'Hx = 0$, the null space $N(F) \subset N(G) \cap N(H)$. We have

$$R(F') = N(F)^\perp \supset [N(G) \cap N(H)]^\perp = N(G)^\perp + N(H)^\perp = R(G') + R(H'). \tag{2.4}$$

Thus, $R(H'M') \subset R(H') \subset R(F')$. Since

$$R(\langle A' + F_1' \mid F'\rangle) = R(\langle A' \mid F'\rangle)$$

for any $F_1'$ such that the range $R(F_1') \subset R(F')$, then

$$R(\langle A' + H'M' \mid F'\rangle) = R(\langle A' \mid F'\rangle) \supset R(\langle A' \mid G'\rangle) = \mathbb{R}^n,$$

so that $R(\langle A' + H'M' \mid F'\rangle) = \mathbb{R}^n$ and $(F, A + MH)$ is observable.

(ii) If $(G,A)$ is detectable, then the unstable part of $(A', G')$ is controllable. From (2.4), we know $R(G'V' - H'M') \subset R(F')$ for any $V'$. Thus, there exists a matrix $U'$ such that $G'V' - H'M' = F'U'$, so that

$$A' + G'V' = A' + H'M' + F'U'.$$

This shows that if we can find $V'$ to reposition the unstable eigenvalues of $A'$ to any desired locations, we can find $U'$ which does the job for $A' + H'M'$. Hence, $(F, A + MH)$ is detectable.


QED 


Lemma 2.2. If $(G,A)$ is detectable, then either $A$ is stable or

$$W(t; A, G) \triangleq \int_0^t e^{A's}G'Ge^{As}\,ds \tag{2.5}$$

is unbounded as $t \to \infty$.

Proof. If $A$ is not stable, then there exist an eigenvalue $\lambda$ of $A$ such that $\mathrm{Re}\,\lambda \ge 0$ and an eigenvector $\eta \ne 0$. Thus,

$$\eta^* W(t; A, G)\eta = \int_0^t e^{2s\,\mathrm{Re}\,\lambda}\,|G\eta|^2\,ds, \tag{2.6}$$

where $*$ denotes transpose and complex conjugate. If (2.5) is bounded as $t \to \infty$, from (2.6) we must have $G\eta = 0$, so that $GA^k\eta = \lambda^k G\eta = 0$ for any $k \ge 0$. Thus,

$$\mathrm{Re}\,\eta \text{ and } \mathrm{Im}\,\eta \in N(\langle A' \mid G'\rangle').$$

Let $E_A^+$ and $E_A^-$ be the generalized eigenspaces of $A$ corresponding to eigenvalues with non-negative and negative real parts, respectively. Then $E_A^+ \oplus E_A^- = \mathbb{R}^n$. Since $(G,A)$ is detectable, $E_A^+ \subset R(\langle A' \mid G'\rangle)$. Thus,

$$N(\langle A' \mid G'\rangle') = R(\langle A' \mid G'\rangle)^\perp \subset (E_A^+)^\perp = E_A^-.$$

Hence, $\mathrm{Re}\,\eta$ and $\mathrm{Im}\,\eta \in E_A^+ \cap E_A^- = \{0\}$, which implies $\eta = 0$, a contradiction.

QED 

Lemma 2.3. Let $(G,A)$ be detectable and suppose

$$A'P + PA + l(P) + G'G = 0 \tag{2.7}$$

has a solution $P \ge 0$ with $l(P)$ being linear in $P$. Then $A$ is stable. Let $\Lambda$ be formally defined as

$$\Lambda(S) = \int_0^\infty \exp(A't)\,l(S)\exp(At)\,dt \tag{2.8}$$

with $\Lambda^k(S) = \Lambda(\Lambda^{k-1}(S))$, $\Lambda^0(S) = S$. If $(G,A)$ is observable, then

$$\sum_{k=0}^\infty \Lambda^k(S) = (I - \Lambda)^{-1}(S) \tag{2.9}$$

for any $n \times n$ symmetric matrix $S$, and $P$ is the unique solution of (2.7), of the form

$$P = (I - \Lambda)^{-1}\int_0^\infty e^{A't}G'Ge^{At}\,dt.$$

Proof. Since $P$ is constant, $\frac{dP}{dt} = 0$, so $P$ is the solution of

$$\frac{d}{dt}S(t) + A'S(t) + S(t)A + l(S) + G'G = 0, \qquad S(T) = P.$$

Thus,

$$P = e^{A'T}Pe^{AT} + \int_0^T e^{A't}[l(P) + G'G]e^{At}\,dt. \tag{2.10}$$

Let

$$Z_t \triangleq \int_0^t e^{A'\tau}G'Ge^{A\tau}\,d\tau.$$

Then (2.10) shows $P - Z_T \ge 0$ since $P \ge 0$. Thus, $Z_T$ is bounded in $T$, $\forall\, T \ge 0$. Since $(G,A)$ is detectable, by Lemma 2.2 we know $A$ must be stable. From (2.7), $P$ has the form

$$P = \Lambda(P) + Z_\infty \tag{2.11}$$

$$\phantom{P} = \Lambda^{k+1}(P) + \sum_{i=0}^k \Lambda^i(Z_\infty), \quad \forall\, k \ge 0. \tag{2.12}$$

Since $Z_\infty \ge 0$ and $\Lambda$ is linear, the series exists and

$$\sum_{k=0}^\infty \Lambda^k(Z_\infty) \le P.$$

Suppose $(G,A)$ is observable; then $Z_\infty \ge Z_t > 0$, $\forall\, t > 0$. If $S \ge 0$ and symmetric, then $S \le \alpha Z_\infty$ for some $\alpha > 0$. Thus,

$$0 \le \sum_{k=0}^\infty \Lambda^k(S) \le \alpha\sum_{k=0}^\infty \Lambda^k(Z_\infty) \le \alpha P,$$

which shows that the series in (2.9) converges. If $S$ is symmetric, but not non-negative definite, then there exist symmetric matrices $S_1 \ge 0$ and $S_2 \ge 0$ such that $S = S_1 - S_2$. Hence, $\sum_{k=0}^\infty \Lambda^k(S) = \sum_{k=0}^\infty \Lambda^k(S_1) - \sum_{k=0}^\infty \Lambda^k(S_2)$ converges since both series converge. Thus (2.9) is established. In particular, $\Lambda^k(S) \to 0$ as $k \to \infty$. From (2.11), $P = (I - \Lambda)^{-1}(Z_\infty)$.

QED 
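The series representation in Lemma 2.3 can be illustrated numerically. The sketch below is a hedged example with assumed data: it takes $l(P) = \lambda_1 C'PC$, a small stable $A$, and $G = I$, and it solves the Lyapunov-type equations by vectorization with Kronecker products (a convenience chosen here, not a method from the text). The partial sums of $\sum_k \Lambda^k(Z_\infty)$ are compared with a direct solve of (2.7).

```python
import numpy as np

n = 2
A = np.array([[-2.0, 1.0], [0.0, -3.0]])    # assumed stable matrix
C = np.array([[0.5, 0.0], [0.2, 0.4]])      # assumed jump matrix
G = np.eye(n)                               # (G, A) is observable
lam1 = 1.0
I = np.eye(n)

# Row-major vectorization: vec(A'X + XA) = lyap @ vec(X), vec(lam1 C'XC) = ell @ vec(X).
lyap = np.kron(A.T, I) + np.kron(I, A.T)
ell = lam1 * np.kron(C.T, C.T)

def Lambda(S):
    """Lambda(S) of (2.8): the solution X of A'X + XA + l(S) = 0."""
    return np.linalg.solve(lyap, -(ell @ S.reshape(-1))).reshape(n, n)

# Z_infinity solves A'Z + ZA + G'G = 0.
Z = np.linalg.solve(lyap, -(G.T @ G).reshape(-1)).reshape(n, n)

P_series, term = np.zeros((n, n)), Z.copy()
for _ in range(200):                        # partial sums of sum_k Lambda^k(Z)
    P_series += term
    term = Lambda(term)

# Direct solve of A'P + PA + lam1 C'PC + G'G = 0.
P_direct = np.linalg.solve(lyap + ell, -(G.T @ G).reshape(-1)).reshape(n, n)
```

For these data the two matrices agree to rounding error and are positive definite, consistent with $P = (I - \Lambda)^{-1}(Z_\infty)$.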

Lemma 2.4 (Minimum principle). Let $P \ge 0$ satisfy $\Psi(P, \bar R, K) + Q = 0$ with $\bar R = R + \lambda_2 D'PD$ and $K = \bar R^{-1}\bar B'P$. Suppose $S \ge 0$ satisfies $\Psi(S, \bar R_S, \tilde K) + Q = 0$ for some matrix $\tilde K$ with $\bar R_S = R + \lambda_2 D'SD$. Let $Q = G'G$. If $(G, \bar A)$ is detectable, then $(\bar A - \bar B\tilde K)$ is stable. In addition, if $(G, \bar A)$ is observable, then $P \le S$.

Proof. Let $G'G + \tilde K'\bar R_S\tilde K = F'F$ with $H = \bar R_S^{1/2}\tilde K$ and $M = -\bar B\bar R_S^{-1/2}$ for $\bar R_S > 0$. By Lemma 2.1, $(F, \bar A - \bar B\tilde K)$ is detectable. From Lemma 2.3 with $l(P) = \lambda_1 C'PC$, we know $(\bar A - \bar B\tilde K)$ is stable. By the minimum property (1.17), we have

$$\begin{aligned} 0 &= \Psi(S, \bar R_S, \tilde K) + Q \\ &= \Psi(P, \bar R, K) + Q \\ &= \Psi(P, \bar R, \tilde K) + Q - (\tilde K - K)'\bar R(\tilde K - K). \end{aligned}$$

Let $V = S - P$. We have

$$\Psi(V, \bar R_S - \bar R, \tilde K) + (\tilde K - K)'\bar R(\tilde K - K) = 0 \tag{2.13}$$

and $\bar R_S - \bar R = \lambda_2 D'VD$. Consider the linear function

$$l(V) = \lambda_1 C'VC + \lambda_2\tilde K'D'VD\tilde K \tag{2.14}$$

and $\Lambda$ as defined in (2.8) with $\bar A - \bar B\tilde K$ replacing $A$. If $(G, \bar A)$ is observable, then $(F, \bar A - \bar B\tilde K)$ is observable and the corresponding series in (2.9) converges. In particular, $\Lambda^k(V) \to 0$ as $k \to \infty$. From (2.13),

$$V = \Lambda(V) + Z$$

where

$$Z \triangleq \int_0^\infty e^{(\bar A - \bar B\tilde K)'t}(\tilde K - K)'\bar R(\tilde K - K)e^{(\bar A - \bar B\tilde K)t}\,dt.$$

Thus,

$$V = \Lambda^{k+1}(V) + \sum_{i=0}^k \Lambda^i(Z) \;\to\; \sum_{i=0}^\infty \Lambda^i(Z) \ge 0 \quad \text{as } k \to \infty.$$

Hence, $P \le S$.

QED 

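Now, we show that under certain conditions, (2.3) has a solution $P \ge 0$. First, suppose $(\bar A, \bar B)$ is stabilizable. Then there exists $K$ such that $(\bar A - \bar BK)$ is stable; and

$$\Psi(P, \bar R_P, K) + Q = 0$$

is equivalent to

$$P = \int_0^\infty e^{(\bar A - \bar BK)'t}[\lambda_1 C'PC + Q + K'\bar R_P K]e^{(\bar A - \bar BK)t}\,dt \triangleq f(P, \bar R_P, K). \tag{2.15}$$

To make $f$ a contraction for each $K$, we have to impose a condition on $l$ defined in (2.14).

Condition (I):

$$\inf\Big\{\Big\|\int_0^\infty e^{(\bar A - \bar BK)'t}\,l(I)\,e^{(\bar A - \bar BK)t}\,dt\Big\| \;:\; K \text{ such that } (\bar A - \bar BK) \text{ is stable}\Big\} < 1.$$

Since

$$-\|P\|\,l(I) \le l(P) \le \|P\|\,l(I),$$

condition (I) implies that there exist some $K_1$ and $0 < \theta < 1$ such that

$$\Big\|\int_0^\infty e^{(\bar A - \bar BK_1)'t}\,l(P)\,e^{(\bar A - \bar BK_1)t}\,dt\Big\| \le \theta\|P\|, \quad \forall\, P.$$

Thus,

$$\|f(P_1, \bar R_{P_1}, K_1) - f(P_2, \bar R_{P_2}, K_1)\| \le \theta\|P_1 - P_2\|.$$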

By successive iteration

$$P^1 = 0, \qquad P^{k+1} = f(P^k, \bar R^k, K_1), \quad k \ge 1,$$

with $\bar R^k = R + \lambda_2 D'P^kD$, we have a fixed solution

$$P_1 = f(P_1, \bar R_1, K_1)$$

with

$$\bar R_1 = R + \lambda_2 D'P_1D.$$

Since $P^k \ge 0$, we have $P_1 \ge 0$. Let

$$K_2 = \bar R_1^{-1}\bar B'P_1.$$

By the minimum property (1.17), we have

$$0 = \Psi(P_1, \bar R_1, K_1) + Q = \Psi(P_1, \bar R_1, K_2) + (K_1 - K_2)'\bar R_1(K_1 - K_2) + Q. \tag{2.16}$$

Let $G'G = Q$, $H = \bar R_1^{1/2}(K_1 - K_2)$ and

$$F'F = G'G + H'H.$$

Suppose $(G, \bar A)$ is detectable. By Lemma 2.1, $(F, \bar A)$ is detectable. From Lemma 2.4, $(\bar A - \bar BK_2)$ is stable. Let

$$P_2^1 = 0, \qquad P_2^{k+1} = f(P_2^k, \bar R_2^k, K_2)$$

with

$$\bar R_2^k \triangleq R + \lambda_2 D'P_2^kD.$$

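Since $P_2^2 \ge 0 = P_2^1$, and assuming $P_2^k \ge P_2^{k-1}$, we have

$$P_2^{k+1} - P_2^k = \int_0^\infty e^{(\bar A - \bar BK_2)'t}\big[\lambda_1 C'(P_2^k - P_2^{k-1})C + \lambda_2 K_2'D'(P_2^k - P_2^{k-1})DK_2\big]e^{(\bar A - \bar BK_2)t}\,dt \ge 0$$

by the induction hypothesis. Thus, $P_2^{k+1} \ge P_2^k \ge 0$, $\forall\, k \ge 1$. We want to show that $\{P_2^k\}$ is bounded. From (2.16),

$$P_1 = \int_0^\infty e^{(\bar A - \bar BK_2)'t}\big[\lambda_1 C'P_1C + Q + K_2'\bar R_1K_2 + (K_1 - K_2)'\bar R_1(K_1 - K_2)\big]e^{(\bar A - \bar BK_2)t}\,dt,$$

so that

$$P_1 - P_2^{k+1} = \int_0^\infty e^{(\bar A - \bar BK_2)'t}\big[\lambda_1 C'(P_1 - P_2^k)C + \lambda_2 K_2'D'(P_1 - P_2^k)DK_2 + (K_1 - K_2)'\bar R_1(K_1 - K_2)\big]e^{(\bar A - \bar BK_2)t}\,dt \ge 0$$

by induction, since $P_1 \ge 0 = P_2^1$. Thus,

$$P_2 \triangleq \lim_{k \to \infty} P_2^k$$

exists and $0 \le P_2 \le P_1$. Repeating the above procedure, we obtain sequences $\{P_k\}$, $\{\bar R_k\}$ and $\{K_k\}$ such that

$$\bar R_k = R + \lambda_2 D'P_kD, \qquad K_{k+1} = \bar R_k^{-1}\bar B'P_k$$

and

$$0 \le P_{k+1} \le P_k \le P_1, \quad \forall\, k \ge 1.$$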

Hence, $P = \lim_{k \to \infty} P_k$ exists and $P \ge 0$. Furthermore,

$$\bar R = \lim_{k \to \infty} \bar R_k = R + \lambda_2 D'PD, \qquad K = \lim_{k \to \infty} K_k = \bar R^{-1}\bar B'P.$$

Since $\Psi(P_k, \bar R_k, K_k) + Q = 0$, passing to the limit, we have $\Psi(P, \bar R, K) + Q = 0$. By Lemma 2.4, $(\bar A - \bar BK)$ is stable. If $(G, \bar A)$ is observable, then $P$ is the minimum non-negative solution of $\Psi(S, \bar R_S, K_S) + Q = 0$, so $P$ must be the unique solution in the class $S \ge 0$. To show $P > 0$, we proceed as follows: if $P = 0$, then $\bar R = R$ and $K = 0$, so then

$$0 = P = \int_0^\infty e^{(\bar A - \bar BK)'t}[\lambda_1 C'PC + Q + K'\bar RK]e^{(\bar A - \bar BK)t}\,dt = \int_0^\infty e^{\bar A't}G'Ge^{\bar At}\,dt.$$

Thus, $Ge^{\bar At} = 0$ for all $t \ge 0$, which contradicts $(G, \bar A)$ being observable. We summarize this result in the following theorem:

Theorem 2.5. If $(\bar A, \bar B)$ is stabilizable, $(Q, \bar A)$ is detectable and condition (I) is satisfied, then $\Psi(S, \bar R_S, K_S) + Q = 0$ has at least one solution $P \ge 0$. The matrix $(\bar A - \bar B\bar R^{-1}\bar B'P)$ is stable with $\bar R = R + \lambda_2 D'PD$. In addition, if $(Q, \bar A)$ is observable, then $P$ is unique among the solution set $S \ge 0$ and indeed $P > 0$.
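In the scalar case the successive approximation behind Theorem 2.5 can be written out in a few lines. The following is a sketch under assumed numerical values, not the paper's algorithm verbatim: for a fixed gain $K$ the equation $\Psi(P, \bar R_P, K) + Q = 0$ is linear in $P$ and is solved in closed form, after which the gain is updated by $K \leftarrow \bar B P / \bar R_P$. The starting gain `K = 2.0` is an assumption chosen so that the denominator below is positive, which plays the role of stability together with the contraction condition.

```python
# Assumed scalar data.
a, b, c, d = 0.5, 1.0, 0.1, 0.1
lam1, lam2 = 1.0, 1.0
Q, R = 1.0, 1.0
a_bar, b_bar = a + lam1 * c, b + lam2 * d

def residual(P):
    """Left-hand side Psi(P, R_bar, K) + Q of (2.3) with the minimizing gain K."""
    R_bar = R + lam2 * d * d * P
    K = b_bar * P / R_bar
    return 2 * (a_bar - b_bar * K) * P + lam1 * c * c * P + K * K * R_bar + Q

K = 2.0            # assumed initial stabilizing gain
history = []
for _ in range(30):
    # For fixed K: 2(a_bar - b_bar K) P + lam1 c^2 P + K^2 (R + lam2 d^2 P) + Q = 0.
    denom = -2 * (a_bar - b_bar * K) - lam1 * c * c - lam2 * d * d * K * K
    assert denom > 0, "gain is not stabilizing for the jump system"
    P = (Q + R * K * K) / denom
    history.append(P)
    K = b_bar * P / (R + lam2 * d * d * P)   # gain update as in (1.21)
```

The iterates in `history` decrease monotonically to a positive root of the algebraic equation, mirroring the ordering $0 \le P_{k+1} \le P_k \le P_1$ obtained above.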

Now, let $P_T(t)$ be the unique solution of

$$\frac{d}{dt}P_T(t) + \Psi(P_T(t), \bar R_T(t), K_T(t)) + Q = 0, \qquad P_T(T) = 0 \tag{2.17}$$

with

$$\bar R_T(t) = R + \lambda_2 D'P_T(t)D, \qquad K_T(t) = \bar R_T(t)^{-1}\bar B'P_T(t).$$

Suppose the hypotheses in Theorem 2.5 are satisfied; then there exists $P \ge 0$, a solution of the algebraic equation (2.3), with $\bar R = R + \lambda_2 D'PD$ and $K = \bar R^{-1}\bar B'P$. From the minimum property (1.17),

$$\Psi(P_T(t), \bar R_T(t), K_T(t)) + Q = \Psi(P_T(t), \bar R_T(t), K) + Q - [K - K_T(t)]'\bar R_T(t)[K - K_T(t)].$$

Let $S_T(t) = P - P_T(t)$. Then

$$\frac{d}{dt}S_T(t) + \Psi(S_T(t), \bar R - \bar R_T(t), K) + [K - K_T(t)]'\bar R_T(t)[K - K_T(t)] = 0, \qquad S_T(T) = P. \tag{2.18}$$

From the previous argument using quasi-linearization and successive approximations on (2.18), we know $S_T(t) \ge 0$, i.e., $P_T(t) \le P$ for all $0 \le t \le T$, which shows $P_T(t)$ is uniformly bounded $\forall\, T \ge 0$.

Lemma 2.6. Let $P_1$ and $P_2$ be solutions of

$$\frac{d}{dt}P(t) + \Psi(P(t), \bar R_P(t), K_P(t)) + Q = 0 \tag{2.19}$$

with terminal conditions $P_1^0$ and $P_2^0$ at $T > 0$, respectively. Suppose $0 \le P_1^0 \le P_2^0$. Then $0 \le P_1(t) \le P_2(t)$, $t \in [0,T]$.

Proof. By the minimum property (1.17),

$$\Psi(P_1(t), \bar R_{P_1}(t), K_{P_1}(t)) = \Psi(P_1(t), \bar R_{P_1}(t), K_{P_2}(t)) - [K_{P_2}(t) - K_{P_1}(t)]'\bar R_{P_1}(t)[K_{P_2}(t) - K_{P_1}(t)].$$

Let $S(t) = P_2(t) - P_1(t)$. Then

$$\frac{d}{dt}S(t) + \Psi(S(t), \bar R_{P_2}(t) - \bar R_{P_1}(t), K_{P_2}(t)) + [K_{P_2}(t) - K_{P_1}(t)]'\bar R_{P_1}(t)[K_{P_2}(t) - K_{P_1}(t)] = 0, \qquad S(T) = P_2^0 - P_1^0 \ge 0. \tag{2.20}$$

Since the last term in (2.20) is non-negative definite, we can show the solution $S(t) \ge 0$ as before by successive approximation. Thus $P_1(t) \le P_2(t)$, $\forall\, 0 \le t \le T$.


QED 

Suppose $P_{T_1}$ and $P_{T_2}$ are solutions of (2.17) with $T_1 \le T_2$. Since $P_{T_2}(T_1) \ge 0 = P_{T_1}(T_1)$, we have $P_{T_2}(t) \ge P_{T_1}(t)$, $t \in [0,T_1]$, by Lemma 2.6. Since $P_T(t) \le P$, then

$$P_\infty(t) = \lim_{T \to \infty} P_T(t)$$

exists pointwise and $0 \le P_\infty(t) \le P$.
Suppose we can choose $K$ so that $\bar A - \bar BK$ is stable. Then, using the minimum property (1.17), the solution of (2.17) can be expressed as

$$P_T(t) = \int_t^T e^{(\bar A - \bar BK)'(\tau - t)}\big\{\lambda_1 C'P_T(\tau)C + Q + K'\bar R_T(\tau)K - [K - K_T(\tau)]'\bar R_T(\tau)[K - K_T(\tau)]\big\}e^{(\bar A - \bar BK)(\tau - t)}\,d\tau. \tag{2.21}$$

Since $P_T(t) \le P_\infty(t) \le P$ and

$$\lim_{T \to \infty}\bar R_T(t) = R + \lambda_2 D'P_\infty(t)D = \bar R_\infty(t) > 0, \qquad \lim_{T \to \infty}K_T(t) = \bar R_\infty(t)^{-1}\bar B'P_\infty(t) = K_\infty(t)$$

are uniformly bounded, we can apply the Dominated Convergence Theorem to (2.21). Then as $T \to \infty$,

$$P_\infty(t) = \int_t^\infty e^{(\bar A - \bar BK)'(\tau - t)}\big\{\lambda_1 C'P_\infty(\tau)C + Q + K'\bar R_\infty(\tau)K - [K - K_\infty(\tau)]'\bar R_\infty(\tau)[K - K_\infty(\tau)]\big\}e^{(\bar A - \bar BK)(\tau - t)}\,d\tau,$$

which shows that $P_\infty$ satisfies (2.19).

Suppose $P_T$ is the solution of (2.17). Set $\hat P_T(t) = P_T(T - t)$. Then $\hat P_T$ satisfies

$$\frac{d}{dt}\hat P(t) = \Psi(\hat P, \bar R_{\hat P}, K_{\hat P}) + Q, \qquad \hat P(0) = 0. \tag{2.22}$$

Since (2.22) has a unique solution, $\hat P_{T_1}(t) = \hat P_{T_2}(t)$, or $P_{T_1}(T_1 - t) = P_{T_2}(T_2 - t)$. By Lemma 2.6, we have $P_T(t_1) \ge P_T(t_2)$ if $0 \le t_1 \le t_2 \le T$. In addition,

$$\begin{aligned} P_\infty(t_1) &= \lim_{T_1 \to \infty} P_{T_1}(t_1) = \lim_{T_1 \to \infty} \hat P_{T_1}(T_1 - t_1) \\ &= \lim_{T_2 \to \infty} \hat P_{T_2}(T_2 - t_2) \quad \text{with } T_2 - t_2 = T_1 - t_1 \\ &= \lim_{T_2 \to \infty} P_{T_2}(t_2) = P_\infty(t_2), \end{aligned}$$

which shows that $P_\infty$ is a constant matrix and is again a solution of the algebraic equation (2.3). Consequently, we have the following theorem.

Theorem 2.7. Let $P_T(t)$ be the unique solution of (2.17). Then $P_T(t)$ is non-decreasing in $T$ and non-increasing in $t$. If $(\bar A, \bar B)$ is stabilizable, $(G, \bar A)$ is detectable and condition (I) holds, then

$$\lim_{T \to \infty} P_T(t) = P_\infty.$$

Indeed, for any $\epsilon > 0$, there exists $d_0$ such that $\|P_\infty - P_T(t)\| < \epsilon$ for all $t \in [0, T - d_0]$ and all $T \ge d_0$. $P_\infty$ satisfies the algebraic equation

$$\Psi(P_\infty, \bar R_\infty, K_\infty) + Q = 0 \tag{2.23}$$

with

$$\bar R_\infty = R + \lambda_2 D'P_\infty D, \qquad K_\infty = \bar R_\infty^{-1}\bar B'P_\infty \tag{2.24}$$

and $(\bar A - \bar BK_\infty)$ is stable. Furthermore, if $(G, \bar A)$ is observable, then $P_\infty$ is the unique solution of (2.23) in the solution class $S \ge 0$ and $P_\infty > 0$.

Proof. The only thing remaining to prove is the convergence of $P_T(t)$. Since $P_T(t) \uparrow P_\infty$ pointwise as $T \to \infty$, for any $\epsilon > 0$ there exists $d_0 > 0$ such that $\|P_T(0) - P_\infty\| < \epsilon$ for all $T \ge d_0$. By the invariance property of $P_T(t)$, $P_{T_1}(t_1) = P_{T_2}(t_2)$ if $T_1 - t_1 = T_2 - t_2$. Since $P_T(t)$ is non-increasing in $t$ and $P_T(t) \le P_\infty$, then

$$\|P_T(t) - P_\infty\| < \epsilon \tag{2.25}$$

for all $t \in [0, T - d_0]$ and all $T \ge d_0$.

QED 
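The monotone convergence in Theorem 2.7 is easy to observe in the scalar case. The sketch below, with assumed numerical values, integrates the time-reversed equation (2.22) from $\hat P(0) = 0$ with a classical Runge-Kutta step, so that `values[j]` approximates $P_T(0)$ for $T = jh$. The function `f` is the scalar form of $\Psi(P, \bar R_P, K_P) + Q$ with the minimizing gain substituted; the step size and horizon are choices made here.

```python
# Assumed scalar data.
a, b, c, d = 0.5, 1.0, 0.1, 0.1
lam1, lam2 = 1.0, 1.0
Q, R = 1.0, 1.0
a_bar, b_bar = a + lam1 * c, b + lam2 * d

def f(P):
    """Scalar Psi(P, R_bar_P, K_P) + Q with K_P = b_bar P / R_bar_P."""
    R_bar = R + lam2 * d * d * P
    return 2 * a_bar * P + lam1 * c * c * P + Q - (b_bar * P) ** 2 / R_bar

h, steps = 0.01, 3000
P, values = 0.0, [0.0]
for _ in range(steps):
    k1 = f(P)
    k2 = f(P + h / 2 * k1)
    k3 = f(P + h / 2 * k2)
    k4 = f(P + h * k3)
    P += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    values.append(P)      # approximates P_T(0) for T = (number of steps) * h

P_inf = values[-1]        # numerical stand-in for the limit in Theorem 2.7
```

The sequence `values` is non-decreasing in the horizon and settles at a root of `f`, that is, at a solution of the algebraic equation (2.23).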

Remark 2.1. Since $B$, $D$ and $R$ are constant matrices, we can show that for any $\epsilon > 0$, there exists $d_0 > 0$ such that

$$\|\bar R_T(t) - \bar R_\infty\| < \epsilon, \qquad \|K_T(t) - K_\infty\| < \epsilon \tag{2.26}$$

for all $t \in [0, T - d_0]$ and all $T \ge d_0$.

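To prove the convergence of $p_T(t)$, we rewrite (1.15) as

$$\frac{d}{dt}p_T(t) + [\bar A - \bar BK_T(t)]'p_T(t) + \lambda_3 P_T(t)\xi = 0$$

or

$$\frac{d}{dt}p_T(t) + (\bar A - \bar BK_\infty)'p_T(t) + [K_\infty - K_T(t)]'\bar B'p_T(t) + \lambda_3 P_T(t)\xi = 0,$$

so that with $p_T(T) = 0$,

$$p_T(t) = \int_t^T e^{(\bar A - \bar BK_\infty)'(\tau - t)}\big\{[K_\infty - K_T(\tau)]'\bar B'p_T(\tau) + \lambda_3 P_T(\tau)\xi\big\}\,d\tau. \tag{2.27}$$

Since $(\bar A - \bar BK_\infty)$ is stable, there exist $M, \beta > 0$ such that

$$\|e^{(\bar A - \bar BK_\infty)'(t - s)}\| \le Me^{-\beta(t - s)} \quad \text{for all } t \ge s.$$

From the convergence properties (2.25) and (2.26), we know that $K_T(t)$ is uniformly bounded in $t$ and $T$ and $\|P_T(t)\| \le \|P_\infty\|$. Thus there exist $M_1 > 0$ and $M_2 > 0$ such that

$$\begin{aligned} |p_T(t)| &\le \int_t^T \|e^{(\bar A - \bar BK_\infty)'(\tau - t)}\|\big\{(\|K_\infty\| + \|K_T(\tau)\|)\|\bar B\|\,|p_T(\tau)| + \lambda_3\|P_\infty\|\,|\xi|\big\}\,d\tau \\ &\le \int_t^T Me^{-\beta(\tau - t)}\{M_1|p_T(\tau)| + M_2\}\,d\tau \\ &\le \frac{MM_2}{\beta} + \int_t^T MM_1e^{-\beta(\tau - t)}|p_T(\tau)|\,d\tau. \end{aligned}$$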

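By Gronwall's inequality, $|p_T(t)|$ is uniformly bounded for all $t \in [0,T]$ and all $T \ge 0$. Let $p_\infty$ be the solution of the algebraic equation

$$(\bar A - \bar BK_\infty)'p + \lambda_3 P_\infty\xi = 0. \tag{2.29}$$

Then

$$p_\infty = -\lambda_3[(\bar A - \bar BK_\infty)']^{-1}P_\infty\xi = \lambda_3\int_t^\infty e^{(\bar A - \bar BK_\infty)'(\tau - t)}P_\infty\xi\,d\tau.$$

From (2.27),

$$p_\infty - p_T(t) = \int_t^T e^{(\bar A - \bar BK_\infty)'(\tau - t)}\big\{[K_T(\tau) - K_\infty]'\bar B'p_T(\tau) + \lambda_3[P_\infty - P_T(\tau)]\xi\big\}\,d\tau + \lambda_3\int_T^\infty e^{(\bar A - \bar BK_\infty)'(\tau - t)}P_\infty\xi\,d\tau.$$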

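Since $(\bar A - \bar BK_\infty)$ is stable, then

$$\Big|\lambda_3\int_T^\infty e^{(\bar A - \bar BK_\infty)'(\tau - t)}P_\infty\xi\,d\tau\Big| < \epsilon$$

for all sufficiently large $T$. By the convergence properties in (2.25) and (2.26), there exists $d_0$ such that

$$\Big|\int_t^{T - d_0} e^{(\bar A - \bar BK_\infty)'(\tau - t)}\big\{[K_T(\tau) - K_\infty]'\bar B'p_T(\tau) + \lambda_3[P_\infty - P_T(\tau)]\xi\big\}\,d\tau\Big| \le \int_t^{T - d_0} Me^{-\beta(\tau - t)}M_3\epsilon\,d\tau \le \frac{MM_3}{\beta}\epsilon$$

for some constant $M_3 > 0$ and all $t \in [0, T - d_0]$, all $T \ge d_0$. Since the integrand of the first integral of (2.28) is uniformly bounded in $t \le T$, there exists an $M_4 > 0$ such that

$$\Big|\int_{T - d_0}^T e^{(\bar A - \bar BK_\infty)'(\tau - t)}\big\{[K_T(\tau) - K_\infty]'\bar B'p_T(\tau) + \lambda_3[P_\infty - P_T(\tau)]\xi\big\}\,d\tau\Big| \le \int_{T - d_0}^T Me^{-\beta(\tau - t)}M_4\,d\tau \le \frac{MM_4}{\beta}\big[e^{-\beta(T - d_0 - t)} - e^{-\beta(T - t)}\big] < \epsilon$$

if $T - d_0 - t$ is sufficiently large. From this analysis, we can conclude that for each $\epsilon > 0$, there exist $d_0 \le d_1$ such that

$$|p_\infty - p_T(t)| < \epsilon \tag{2.30}$$

for all $t \in [0, T - d_0]$ and $T \ge d_1$. Furthermore, let

$$k_\infty = \bar R_\infty^{-1}\bar B'p_\infty, \qquad u_\infty(t) = -[K_\infty x_\infty(t) + k_\infty].$$

In the same way, we can prove that for any $\epsilon > 0$, there exist $d_1 \ge d_0 > 0$ such that

$$|k_\infty - k_T(t)| < \epsilon \tag{2.31}$$

for all $t \in [0, T - d_0]$ and all $T \ge d_1$.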

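Recognizing that $q_T(t)$ tends to infinity as $T \to \infty$, we want to show that on the average it tends to a constant in the sense

$$\begin{aligned}
\lim_{T\to\infty}\frac{1}{T}q_T(t) &= \lim_{T\to\infty}\frac{1}{T}\int_t^T [2\lambda_3\xi'p_T(\tau) + \lambda_3\xi'P_T(\tau)\xi - k_T(\tau)'R_T(\tau)k_T(\tau)]\,d\tau \\
&= 2\lambda_3\xi'p_\infty + \lambda_3\xi'P_\infty\xi - k_\infty'R_\infty k_\infty \qquad (2.32) \\
&= q_\infty.
\end{aligned}$$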

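With the properties $P_T \to P_\infty$ and $p_T \to p_\infty$ as $T \to \infty$, for any $\epsilon > 0$ there exist $d_0 < d_1$ such that

$$\|P_T(\tau) - P_\infty\| < \epsilon, \qquad |p_T(\tau) - p_\infty| < \epsilon,$$

$$\|R_T(\tau) - R_\infty\| < \epsilon, \qquad |k_T(\tau) - k_\infty| < \epsilon,$$

for all $\tau \in [0, T-d_0]$ and all $T > d_1$. Thus,

$$\begin{aligned}
\frac{1}{T}q_T(t) - q_\infty = \frac{1}{T}\left\{\int_t^{T-d_0} + \int_{T-d_0}^T\right\}\Big\{ & 2\lambda_3\xi'[p_T(\tau) - p_\infty] + \lambda_3\xi'[P_T(\tau) - P_\infty]\xi \\
& - [k_T(\tau) - k_\infty]'R_T(\tau)k_T(\tau) - k_\infty'[R_T(\tau) - R_\infty]k_T(\tau) \\
& - k_\infty'R_\infty[k_T(\tau) - k_\infty]\Big\}\,d\tau - \frac{t}{T}\,q_\infty.
\end{aligned}$$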


In the second integral, over the finite interval $[T-d_0, T]$, the integrand is uniformly bounded, so that the average tends to 0 as $T \to \infty$, while in the first integral the integrand is less than some constant multiplying $\epsilon$, so that the average tends to 0 as $\epsilon \to 0$. Note that $\frac{t}{T}q_\infty \to 0$ as $T \to \infty$ for each finite $t$. Hence the limit in (2.32) is uniform on each compact interval. Thus, for the average cost

$$\lim_{T\to\infty}\frac{1}{T}J(u_T) = \lim_{T\to\infty}\frac{1}{T}V_T(0, x_0) = \lim_{T\to\infty}\frac{1}{T}q_T(0) = q_\infty.$$

If there exists a control $u(t)$ defined on $[0, \infty)$ such that $J_{av}(u) < q_\infty$, then there is some $T_0$ such that for $T > T_0$,

$$\frac{1}{T}[J(u) - J(u_T)] < 0$$

which implies $J(u) < J(u_T)$ for all $T > T_0$ and contradicts the hypothesis that $J(u_T)$ is the optimal value on $[0, T]$. Thus, $q_\infty$ is the optimal value. We show

$$J_{av}(u_\infty) = q_\infty \qquad (2.33)$$

in the next section.

3. Convergence of the Optimal Control and State


We shall directly prove the convergence of the optimal trajectory $x_T(t)$ with the optimal control $u_T(t)$ on $[0, T)$. In this way we avoid the difficulty of determining an ergodic probability measure for the process $x(t)$ with control $u(t)$ using a Lyapunov criterion as in [1], [3] and [4], which is the usual method of constructing the optimal stationary control. Before proving (2.33), we need some lemmas.


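Lemma 3.1 (Stochastic Gronwall inequality). Let $g(t) \ge 0$, $\phi(t)$, $f(t)$ and $h(t)$ be real random functions such that

$$\phi(t) \le f(t) + \int_0^t g(\tau)\phi(\tau)\,d\tau + \int_0^t h(\tau)\phi(\tau)\,dN(\tau) \quad \text{a.s.} \qquad (3.1)$$

where $N(t)$ is a Poisson counting process (which may be inhomogeneous) that counts the incidences during $[0, t]$. Then

$$\begin{aligned}
\phi(t) \le{} & f(t) + \int_0^t g(\tau)f(\tau)\exp\Big(\int_\tau^t g(s)\,ds\Big)d\tau \\
& + \int_0^t \Big[f(\gamma) + \int_0^\gamma g(\tau)f(\tau)\exp\Big(\int_\tau^\gamma g(s)\,ds\Big)d\tau\Big]\exp\Big(\int_\gamma^t g(s)\,ds\Big) \\
& \qquad \cdot\, h(\gamma)\exp\Big(\int_\gamma^t h(s)\,dN(s)\Big)dN(\gamma) \quad \text{a.s.} \qquad (3.2)
\end{aligned}$$

In addition, if $f(t) = f$, a constant, then

$$\phi(t) \le f\exp\Big(\int_0^t g(s)\,ds\Big)\exp\Big(\int_0^t h(s)\,dN(s)\Big) \quad \text{a.s.} \qquad (3.3)$$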
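As a numerical illustration of the constant-$f$ bound (3.3), the following minimal sketch simulates one sample path. It rests on assumptions that are not in the text: $g$ and $h \ge 0$ are taken constant, (3.1) is taken with equality, $N$ is homogeneous with rate `rate`, and the integrand of the jump term is evaluated just before each jump. Under those assumptions the exact solution is $f e^{gt}(1+h)^{N(t)}$, while (3.3) gives $f e^{gt}e^{hN(t)}$. The function name and all parameter values are hypothetical.

```python
import math
import random


def simulate_bound(f=1.0, g=0.5, h=0.3, rate=2.0, horizon=5.0, seed=0):
    """Compare the exact jump solution with the bound (3.3) on one sample path.

    Assumed setting (illustrative only): constant g and h >= 0, equality in
    (3.1), homogeneous Poisson process with intensity `rate`.
    """
    rng = random.Random(seed)
    # Draw the jump times t_1 < t_2 < ... from exponential interarrival times.
    jump_times = []
    t = rng.expovariate(rate)
    while t <= horizon:
        jump_times.append(t)
        t += rng.expovariate(rate)

    results = []
    for k in range(11):
        t = horizon * k / 10
        n_jumps = sum(1 for tj in jump_times if tj <= t)
        # Between jumps phi grows like exp(g t); each jump multiplies it by (1 + h).
        exact = f * math.exp(g * t) * (1.0 + h) ** n_jumps
        # The bound (3.3) replaces each factor (1 + h) by exp(h).
        bound = f * math.exp(g * t) * math.exp(h * n_jumps)
        results.append((t, n_jumps, exact, bound))
    return results


if __name__ == "__main__":
    for t, n, exact, bound in simulate_bound():
        print(f"t={t:4.1f}  N(t)={n:2d}  phi={exact:10.4f}  bound={bound:10.4f}")
```

The gap between the two columns comes only from the elementary inequality $1 + h \le e^h$, which is the step that closes the induction in the proof.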


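Proof. Denote by $\{\tau_i\}$ the interarrival times of $N(t)$ and let $t_i = \tau_1 + \cdots + \tau_i$. Define

$$\phi_1(t) = \int_0^t g(\tau)\phi(\tau)\,d\tau + \int_0^t h(\tau)\phi(\tau)\,dN(\tau). \qquad (3.4)$$

Then for $0 \le t < t_1$,

$$\phi(t) \le f(t) + \int_0^t g(\tau)\phi(\tau)\,d\tau$$

so that by the ordinary Gronwall inequality, we have

$$\phi_1(t) \le \int_0^t g(\tau)f(\tau)\exp\Big(\int_\tau^t g(s)\,ds\Big)d\tau, \qquad 0 \le t < t_1.$$

Suppose for $t_{i-1} \le t < t_i$,

$$\begin{aligned}
\phi_1(t) \le{} & \int_0^t g(\tau)f(\tau)\exp\Big(\int_\tau^t g(s)\,ds\Big)d\tau \\
& + \sum_{j=1}^{i-1}\Big[f(t_j) + \int_0^{t_j} g(\tau)f(\tau)\exp\Big(\int_\tau^{t_j} g(s)\,ds\Big)d\tau\Big]\exp\Big(\int_{t_j}^t g(s)\,ds\Big) \\
& \qquad \cdot\, h(t_j)\exp\Big(\sum_{k=j+1}^{i-1} h(t_k)\Big) \quad \text{a.s.} \qquad (3.5)
\end{aligned}$$

Then from (3.1) and (3.5)

$$\begin{aligned}
\phi_1(t_i) \le{} & \Big\{\int_0^{t_i} g(\tau)f(\tau)\exp\Big(\int_\tau^{t_i} g(s)\,ds\Big)d\tau \\
& + \sum_{j=1}^{i-1}\Big[f(t_j) + \int_0^{t_j} g(\tau)f(\tau)\exp\Big(\int_\tau^{t_j} g(s)\,ds\Big)d\tau\Big]\exp\Big(\int_{t_j}^{t_i} g(s)\,ds\Big) \\
& \qquad \cdot\, h(t_j)\exp\Big(\sum_{k=j+1}^{i-1} h(t_k)\Big)\Big\}\,(1 + h(t_i)) + f(t_i)h(t_i) \quad \text{a.s.} \\
=:{} & M
\end{aligned}$$

so that for $t_i \le t < t_{i+1}$

$$\phi(t) \le f(t) + M + \int_{t_i}^t g(\tau)\phi(\tau)\,d\tau.$$

Again, by the ordinary Gronwall inequality,

$$\begin{aligned}
\int_{t_i}^t g(\tau)\phi(\tau)\,d\tau &\le \int_{t_i}^t g(\tau)(f(\tau) + M)\exp\Big(\int_\tau^t g(s)\,ds\Big)d\tau \quad \text{a.s.} \\
&= \int_{t_i}^t g(\tau)f(\tau)\exp\Big(\int_\tau^t g(s)\,ds\Big)d\tau - M + M\exp\Big(\int_{t_i}^t g(s)\,ds\Big) \quad \text{a.s.}
\end{aligned}$$

Thus, for $t_i \le t < t_{i+1}$,

$$\begin{aligned}
\phi_1(t) &\le M + \int_{t_i}^t g(\tau)\phi(\tau)\,d\tau \\
&\le \int_{t_i}^t g(\tau)f(\tau)\exp\Big(\int_\tau^t g(s)\,ds\Big)d\tau + M\exp\Big(\int_{t_i}^t g(s)\,ds\Big) \\
&\le \int_0^t g(\tau)f(\tau)\exp\Big(\int_\tau^t g(s)\,ds\Big)d\tau \\
&\quad + \sum_{j=1}^{i}\Big[f(t_j) + \int_0^{t_j} g(\tau)f(\tau)\exp\Big(\int_\tau^{t_j} g(s)\,ds\Big)d\tau\Big]\exp\Big(\int_{t_j}^t g(s)\,ds\Big)\, h(t_j)\exp\Big(\sum_{k=j+1}^{i} h(t_k)\Big).
\end{aligned}$$

By induction, (3.5) holds for any $i$ and the result (3.2) follows. If $f(t) = f$, (3.5) becomes

$$\begin{aligned}
\phi_1(t) &\le -f + f\exp\Big(\int_0^t g(s)\,ds\Big) + f\exp\Big(\int_0^t g(s)\,ds\Big)\sum_{j=1}^{i-1} h(t_j)\exp\Big(\sum_{k=j+1}^{i-1} h(t_k)\Big) \\
&= f\exp\Big(\int_0^t g(s)\,ds\Big)\Big[1 + \sum_{j=1}^{i-1} h(t_j)\exp\Big(\sum_{k=j+1}^{i-1} h(t_k)\Big)\Big] - f \\
&\le f\exp\Big(\int_0^t g(s)\,ds\Big)\exp\Big(\sum_{j=1}^{i-1} h(t_j)\Big) - f
\end{aligned}$$

for $t_{i-1} \le t < t_i$. Thus (3.3) follows.

QED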


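Remark. Suppose

$$\phi(t) \le f(t) + \int_0^t g(\tau)\phi(\tau)\,d\tau + \int_0^t h_1(\tau)\phi(\tau)\,dN_1(\tau) + \int_0^t h_2(\tau)\phi(\tau)\,dN_2(\tau) \qquad (3.6)$$

where $N_1(t)$ and $N_2(t)$ are independent Poisson counting processes with zero probability of simultaneous jumps. Then we can define a Poisson process $N(t) = N_1(t) + N_2(t)$ and a random process $\{\mu(t)\}$ such that $\mu(t) = i$ for $t_j \le t < t_{j+1}$ if $N_i(t_j)$ increases, $i = 1, 2$. Thus (3.6) is equivalent to

$$\phi(t) \le f(t) + \int_0^t g(\tau)\phi(\tau)\,d\tau + \int_0^t h_{\mu(\tau)}(\tau)\phi(\tau)\,dN(\tau) \quad \text{a.s.}$$

By Lemma 3.1,

$$\begin{aligned}
\phi(t) \le{} & f(t) + \int_0^t g(\tau)f(\tau)\exp\Big(\int_\tau^t g(s)\,ds\Big)d\tau \\
& + \int_0^t \Big[f(\gamma) + \int_0^\gamma g(\tau)f(\tau)\exp\Big(\int_\tau^\gamma g(s)\,ds\Big)d\tau\Big]\exp\Big(\int_\gamma^t g(s)\,ds\Big) \\
& \qquad \cdot\, h_{\mu(\gamma)}(\gamma)\exp\Big(\int_\gamma^t h_{\mu(s)}(s)\,dN(s)\Big)dN(\gamma) \quad \text{a.s.} \\
={} & f(t) + \int_0^t g(\tau)f(\tau)\exp\Big(\int_\tau^t g(s)\,ds\Big)d\tau \\
& + \int_0^t \Big[f(\gamma) + \int_0^\gamma g(\tau)f(\tau)\exp\Big(\int_\tau^\gamma g(s)\,ds\Big)d\tau\Big]\exp\Big(\int_\gamma^t g(s)\,ds\Big) \\
& \qquad \cdot\, h_1(\gamma)\exp\Big(\int_\gamma^t h_1(s)\,dN_1(s) + \int_\gamma^t h_2(s)\,dN_2(s)\Big)dN_1(\gamma) \\
& + \int_0^t \Big[f(\gamma) + \int_0^\gamma g(\tau)f(\tau)\exp\Big(\int_\tau^\gamma g(s)\,ds\Big)d\tau\Big]\exp\Big(\int_\gamma^t g(s)\,ds\Big) \\
& \qquad \cdot\, h_2(\gamma)\exp\Big(\int_\gamma^t h_1(s)\,dN_1(s) + \int_\gamma^t h_2(s)\,dN_2(s)\Big)dN_2(\gamma) \quad \text{a.s.} \qquad (3.7)
\end{aligned}$$

If $f(t) = f$, then

$$\phi(t) \le f\exp\Big(\int_0^t g(s)\,ds\Big)\exp\Big(\int_0^t h_1(s)\,dN_1(s)\Big)\exp\Big(\int_0^t h_2(s)\,dN_2(s)\Big) \quad \text{a.s.} \qquad (3.8)$$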


Lemma 3.2. Suppose $f(t,s) \ge 0$, $0 \le s \le t$, is a continuous real random function. Then $\int_0^t f(t,s)\,ds$ is uniformly bounded in $t$ a.s. if and only if $\int_0^t f(t,s)\,dN(s)$ is uniformly bounded in $t$ a.s., where $N(t)$ is a Poisson counting process.

Proof. Let $\{\tau_i\}$ be the interarrival times corresponding to $N(t)$ and let $t_i = \tau_1 + \cdots + \tau_i$ be the occurrence times. If $\int_0^t f(t,s)\,ds$ is uniformly bounded in $t$ a.s. and $f(t,s) \ge 0$, $0 \le s \le t$, then there exists $M(\omega) > 0$ which is independent


1  1^  /*  I  11 

T  E  /  (*.«;■)  <  A/(w)  for  aj  E  [—(i -1),— j  ; 


for  all  I  >0  and  sufficiently  large  k,  [— ]  denotes  the  largest  integer  <  —  .  Thus,  the 

k  k 


$$f(t,t_1) + f(t,t_2) + \cdots + f(t,t_l), \qquad t_l \le t < t_{l+1} \qquad (3.10)$$

diverges as $t \to \infty$ only if there exists a subsequence $\{\tau_k\}$ such that $\tau_k \to 0$. However, $P(\tau_k < 1/k) = 1 - e^{-\lambda/k} < 1$. Then $\tau_k \to 0$ with probability 0. Hence (3.9) converges a.s. and so $\int_0^t f(t,s)\,dN(s)$ is uniformly bounded in $t$. Conversely, if (3.10) converges a.s., then (3.9) diverges as $t \to \infty$ for any $k$ only if there exists a subsequence $\{\tau_k\}$ such that $\tau_k \to \infty$. But $P(\tau > T) = e^{-\lambda T} < 1$. Thus, $\tau_k < \infty$ a.s. Hence $\int_0^t f(t,s)\,ds$ is also uniformly bounded in $t$ a.s.


QED

Now, we turn to our problem: to show that $x_T(t) \to x_\infty(t)$ with control $u_T(t) \to u_\infty(t)$ as $T \to \infty$. Since

$$dx_T(t) = Ax_T(t)\,dt + Bu_T(t)\,dt + Cx_T(t)\,dN_1(t) + Du_T(t)\,dN_2(t) + \xi\,dN_3(t)$$

with $u_T(t) = -[K_T(t)x_T(t) + k_T(t)]$, then

$$\begin{aligned}
dx_T(t) ={} & [(A - BK_T(t))x_T(t) - Bk_T(t)]\,dt + Cx_T(t)\,dN_1(t) - D[K_T(t)x_T(t) + k_T(t)]\,dN_2(t) + \xi\,dN_3(t) \\
={} & [A - BK_T(t)]x_T(t)\,dt - Bk_T(t)\,dt - \lambda_1 Cx_T(t)\,dt + \lambda_2 DK_T(t)x_T(t)\,dt \\
& + Cx_T(t)\,dN_1(t) - DK_T(t)x_T(t)\,dN_2(t) - Dk_T(t)\,dN_2(t) + \xi\,dN_3(t) \\
={} & [A - BK_\infty]x_T(t)\,dt + B[K_\infty - K_T(t)]x_T(t)\,dt + [\lambda_2 DK_T(t) - \lambda_1 C]x_T(t)\,dt \\
& - Bk_T(t)\,dt + Cx_T(t)\,dN_1(t) - DK_T(t)x_T(t)\,dN_2(t) - Dk_T(t)\,dN_2(t) + \xi\,dN_3(t). \qquad (3.11)
\end{aligned}$$

Thus,

$$\begin{aligned}
x_T(t) ={} & e^{(A-BK_\infty)t}x_0 + \int_0^t e^{(A-BK_\infty)(t-s)}\{B[K_\infty - K_T(s)] + \lambda_2 DK_T(s) - \lambda_1 C\}x_T(s)\,ds \\
& - \int_0^t e^{(A-BK_\infty)(t-s)}Bk_T(s)\,ds \\
& + \int_0^t e^{(A-BK_\infty)(t-s)}[Cx_T(s)\,dN_1(s) - DK_T(s)x_T(s)\,dN_2(s)] \\
& + \int_0^t e^{(A-BK_\infty)(t-s)}[-Dk_T(s)\,dN_2(s) + \xi\,dN_3(s)].
\end{aligned}$$

Since $(A - BK_\infty)$ is stable, there exist $M$ and $\beta > 0$ such that

$$\|e^{(A-BK_\infty)(t-s)}\| \le M e^{-\beta(t-s)},$$

so that

$$\begin{aligned}
|x_T(t)| \le{} & M e^{-\beta t}|x_0| + \int_0^t M e^{-\beta(t-s)}\|B\|\,|k_T(s)|\,ds \\
& + \int_0^t M e^{-\beta(t-s)}[\|D\|\,|k_T(s)|\,dN_2(s) + |\xi|\,dN_3(s)] \\
& + \int_0^t M e^{-\beta(t-s)}[\|B\|\,\|K_\infty - K_T(s)\| + \lambda_1\|C\| + \lambda_2\|D\|\,\|K_T(s)\|]\,|x_T(s)|\,ds \\
& + \int_0^t M e^{-\beta(t-s)}[\|C\|\,|x_T(s)|\,dN_1(s) + \|D\|\,\|K_T(s)\|\,|x_T(s)|\,dN_2(s)]. \qquad (3.12)
\end{aligned}$$

Since $k_T(t)$ and $K_T(t)$ are uniformly bounded for $0 \le t \le T < \infty$ and

$$\int_0^t e^{-\beta(t-s)}\,ds = \frac{1}{\beta}(1 - e^{-\beta t}) \le \frac{1}{\beta} \quad \text{for all } t \ge 0,$$

the third term on the right-hand side of (3.12) is finite a.s. for every $t \ge 0$ by Lemma 3.2. The second term is easily shown to be finite for all $t$. Thus, there exist $M_1(\omega) > 0$ and constants $M_2$, $M_3$ and $M_4$ such that (3.12) becomes

$$\begin{aligned}
|x_T(t,\omega)| \le{} & M_1(\omega) + \int_0^t M_2 e^{-\beta(t-s)}|x_T(s,\omega)|\,ds \\
& + \int_0^t M_3 e^{-\beta(t-s)}|x_T(s,\omega)|\,dN_1(s) + \int_0^t M_4 e^{-\beta(t-s)}|x_T(s,\omega)|\,dN_2(s).
\end{aligned}$$

By the Stochastic Gronwall Lemma 3.1,

$$|x_T(t,\omega)| \le M_1(\omega)\exp\Big(\int_0^t M_2 e^{-\beta(t-s)}\,ds\Big)\exp\Big(\int_0^t M_3 e^{-\beta(t-s)}\,dN_1(s)\Big)\exp\Big(\int_0^t M_4 e^{-\beta(t-s)}\,dN_2(s)\Big) \quad \text{a.s.} \qquad (3.13)$$

Since each integrand in each exponent is uniformly integrable over $[0, t)$ for all $t$, by Lemma 3.2, $|x_T(t,\omega)|$ is uniformly bounded for $0 \le t \le T$ and all $T > 0$. Furthermore,

$$dx_\infty(t) = Ax_\infty(t)\,dt + Bu_\infty(t)\,dt + Cx_\infty(t)\,dN_1(t) + Du_\infty(t)\,dN_2(t) + \xi\,dN_3(t)$$

with $u_\infty(t) = -[K_\infty x_\infty(t) + k_\infty]$. We have

$$\begin{aligned}
dx_\infty(t) ={} & (A - BK_\infty)x_\infty(t)\,dt - Bk_\infty\,dt + Cx_\infty(t)\,dN_1(t) - DK_\infty x_\infty(t)\,dN_2(t) - Dk_\infty\,dN_2(t) + \xi\,dN_3(t) \\
={} & (A - BK_\infty)x_\infty(t)\,dt - Bk_\infty\,dt - Dk_\infty\,dN_2(t) + \xi\,dN_3(t) \\
& - \lambda_1 Cx_\infty(t)\,dt + \lambda_2 DK_\infty x_\infty(t)\,dt + Cx_\infty(t)\,dN_1(t) - DK_\infty x_\infty(t)\,dN_2(t)
\end{aligned}$$

so that

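$$\begin{aligned}
d[x_\infty(t) - x_T(t)] ={} & [A - BK_\infty][x_\infty(t) - x_T(t)]\,dt - B[k_\infty - k_T(t)]\,dt \\
& - D[k_\infty - k_T(t)]\,dN_2(t) - B[K_\infty - K_T(t)]x_T(t)\,dt \\
& - \lambda_1 C[x_\infty(t) - x_T(t)]\,dt + \lambda_2 D[K_\infty x_\infty(t) - K_T(t)x_T(t)]\,dt \\
& + C[x_\infty(t) - x_T(t)]\,dN_1(t) - D[K_\infty x_\infty(t) - K_T(t)x_T(t)]\,dN_2(t).
\end{aligned}$$

Then

$$\begin{aligned}
x_\infty(t) - x_T(t) ={} & -\int_0^t e^{(A-BK_\infty)(t-s)}B[k_\infty - k_T(s)]\,ds - \int_0^t e^{(A-BK_\infty)(t-s)}D[k_\infty - k_T(s)]\,dN_2(s) \\
& - \int_0^t e^{(A-BK_\infty)(t-s)}B[K_\infty - K_T(s)]x_T(s)\,ds \\
& + \int_0^t e^{(A-BK_\infty)(t-s)}[-\lambda_1 C + \lambda_2 DK_\infty][x_\infty(s) - x_T(s)]\,ds \\
& + \int_0^t e^{(A-BK_\infty)(t-s)}C[x_\infty(s) - x_T(s)]\,dN_1(s) \\
& - \int_0^t e^{(A-BK_\infty)(t-s)}DK_\infty[x_\infty(s) - x_T(s)]\,dN_2(s) \\
& - \int_0^t e^{(A-BK_\infty)(t-s)}D[K_\infty - K_T(s)]x_T(s)\,dN_2(s).
\end{aligned}$$

Thus

$$\begin{aligned}
|x_\infty(t) - x_T(t)| \le{} & \int_0^t M e^{-\beta(t-s)}\|B\|\,|k_\infty - k_T(s)|\,ds + \int_0^t M e^{-\beta(t-s)}\|D\|\,|k_\infty - k_T(s)|\,dN_2(s) \\
& + \int_0^t M e^{-\beta(t-s)}\|B\|\,\|K_\infty - K_T(s)\|\,|x_T(s)|\,ds \\
& + \int_0^t M e^{-\beta(t-s)}\|D\|\,\|K_\infty - K_T(s)\|\,|x_T(s)|\,dN_2(s) \\
& + \int_0^t M e^{-\beta(t-s)}[\lambda_1\|C\| + \lambda_2\|D\|\,\|K_\infty\|]\,|x_\infty(s) - x_T(s)|\,ds \qquad (3.14) \\
& + \int_0^t M e^{-\beta(t-s)}\|C\|\,|x_\infty(s) - x_T(s)|\,dN_1(s) \\
& + \int_0^t M e^{-\beta(t-s)}\|D\|\,\|K_\infty\|\,|x_\infty(s) - x_T(s)|\,dN_2(s).
\end{aligned}$$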

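By (2.26), (2.31) and the uniform boundedness of $x_T(s)$, we argue as before that for each $\epsilon > 0$ there exist $d_0 < d_1$ such that for all $T > d_1$ the sum of the first four integrals of (3.14) is less than $\epsilon M_1(\omega)$ for some $M_1(\omega) > 0$ and all $t \in [0, T-d_0]$, and there exist constants $M_2$, $M_3$ and $M_4$ such that

$$\begin{aligned}
|x_\infty(t) - x_T(t)| \le{} & \epsilon M_1(\omega) + \int_0^t M_2 e^{-\beta(t-s)}|x_\infty(s) - x_T(s)|\,ds \\
& + \int_0^t M_3 e^{-\beta(t-s)}|x_\infty(s) - x_T(s)|\,dN_1(s) \\
& + \int_0^t M_4 e^{-\beta(t-s)}|x_\infty(s) - x_T(s)|\,dN_2(s). \qquad (3.15)
\end{aligned}$$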

Again we apply the Stochastic Gronwall Lemma 3.1 and Lemma 3.2 to (3.15). We get the result

$$\begin{aligned}
|x_\infty(t) - x_T(t)| &\le \epsilon M_1(\omega)\exp\Big(\int_0^t M_2 e^{-\beta(t-s)}\,ds\Big)\exp\Big(\int_0^t M_3 e^{-\beta(t-s)}\,dN_1(s)\Big)\exp\Big(\int_0^t M_4 e^{-\beta(t-s)}\,dN_2(s)\Big) \\
&\le \epsilon M_5(\omega)
\end{aligned}$$

for some $M_5(\omega)$, all $t \in [0, T-d_0]$ and all $T > d_1$. Thus,

$$\begin{aligned}
|u_T(t) - u_\infty(t)| &\le \|K_T(t) - K_\infty\|\,|x_T(t)| + \|K_\infty\|\,|x_T(t) - x_\infty(t)| + |k_T(t) - k_\infty| \\
&\le \epsilon M_6(\omega)
\end{aligned}$$

for some $M_6(\omega)$, all $t \in [0, T-d_0]$ and all $T > d_1$. Hence,

$$\begin{aligned}
\frac{1}{T}[J_T(u_\infty) - J_T(u_T)] &= \frac{1}{T}E\int_0^T \{[x_\infty(t)'Qx_\infty(t) - x_T(t)'Qx_T(t)] + [u_\infty(t)'Ru_\infty(t) - u_T(t)'Ru_T(t)]\}\,dt \qquad (3.16) \\
&= \frac{1}{T}E\int_0^T \{[x_\infty(t) - x_T(t)]'Qx_\infty(t) + x_T(t)'Q[x_\infty(t) - x_T(t)] \\
&\qquad\qquad + [u_\infty(t) - u_T(t)]'Ru_\infty(t) + u_T(t)'R[u_\infty(t) - u_T(t)]\}\,dt.
\end{aligned}$$

Since $x_T$, $x_\infty$, $u_T$ and $u_\infty$ are uniformly bounded, the integral in (3.16) can be partitioned into two parts: the first integral, over $[0, T-d_0]$, is less than $\epsilon M_7(\omega)$ for some $M_7(\omega) > 0$, while the second integral, over $[T-d_0, T]$, tends to zero as $T \to \infty$. Hence, (3.16) tends to 0 as $T \to \infty$ and (2.33) follows. We summarize the entire analysis in the following theorem.

Theorem 3.3. Suppose all the coefficient matrices of (1.1) are constant. If $(A + \lambda_1 C,\ B + \lambda_2 D)$ is stabilizable, $(Q^{1/2},\ A + \lambda_1 C)$ is observable and condition (I) in section 2 holds, then the optimal control exists and is of the form

$$u_\infty(t) = -K_\infty x_\infty(t) - k_\infty \qquad (3.17)$$

with

$$K_\infty = [R + \lambda_2 D'P_\infty D]^{-1}(B + \lambda_2 D)'P_\infty$$

$$k_\infty = [R + \lambda_2 D'P_\infty D]^{-1}(B + \lambda_2 D)'p_\infty \qquad (3.18)$$

where $P_\infty$ and $p_\infty$ are the unique solutions of

$$(A + \lambda_1 C)'P + P(A + \lambda_1 C) + \lambda_1 C'PC + Q - P(B + \lambda_2 D)[R + \lambda_2 D'PD]^{-1}(B + \lambda_2 D)'P = 0 \qquad (3.19)$$

and

$$[A + \lambda_1 C - (B + \lambda_2 D)(R + \lambda_2 D'P_\infty D)^{-1}(B + \lambda_2 D)'P_\infty]'p + \lambda_3 P_\infty\xi = 0 \qquad (3.20)$$

respectively. The optimal average cost is

$$J_{av}(u_\infty) = 2\lambda_3\xi'p_\infty + \lambda_3\xi'P_\infty\xi - k_\infty'R_\infty k_\infty. \qquad (3.21)$$
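To make (3.18)-(3.21) concrete, here is a minimal sketch for the scalar case only, where (3.19) reduces to a quadratic in $P$ after multiplying through by $R + \lambda_2 D^2 P$. The function names, the root-selection rule (positive $P$ with a stable closed loop) and all numerical values are illustrative assumptions, not taken from the paper; the matrix case would need a proper Riccati solver.

```python
import math


def riccati_residual(P, a, b, c, d, lam1, lam2, q_w, r_w):
    """Left-hand side of the scalar form of (3.19)."""
    a_t = a + lam1 * c                      # A + lambda_1 C
    b_t = b + lam2 * d                      # B + lambda_2 D
    return (2.0 * a_t * P + lam1 * c * c * P + q_w
            - (b_t * P) ** 2 / (r_w + lam2 * d * d * P))


def solve_scalar_case(a, b, c, d, xi, lam1, lam2, lam3, q_w, r_w):
    """Scalar version of (3.18)-(3.21): gains and optimal average cost."""
    a_t = a + lam1 * c
    b_t = b + lam2 * d
    lin = 2.0 * a_t + lam1 * c * c
    # (lin*P + q_w)*(r_w + lam2*d^2*P) - b_t^2*P^2 = 0, i.e. qa*P^2 + qb*P + qc = 0.
    qa = lin * lam2 * d * d - b_t * b_t
    qb = lin * r_w + q_w * lam2 * d * d
    qc = q_w * r_w
    disc = qb * qb - 4.0 * qa * qc
    if qa == 0.0 or disc < 0.0:
        raise ValueError("no real quadratic root for these parameters")
    for sign in (1.0, -1.0):
        P = (-qb + sign * math.sqrt(disc)) / (2.0 * qa)
        if P <= 0.0:
            continue
        r_inf = r_w + lam2 * d * d * P          # R_infinity
        K = b_t * P / r_inf                     # feedback gain in (3.18)
        closed_loop = a_t - b_t * K
        if closed_loop >= 0.0:
            continue                            # keep only the stabilizing root
        p = -lam3 * P * xi / closed_loop        # scalar form of (3.20)
        k = b_t * p / r_inf                     # feed-forward gain in (3.18)
        avg_cost = 2.0 * lam3 * xi * p + lam3 * xi * xi * P - k * k * r_inf  # (3.21)
        return {"P": P, "p": p, "K": K, "k": k,
                "closed_loop": closed_loop, "avg_cost": avg_cost}
    raise ValueError("no positive stabilizing solution found")


if __name__ == "__main__":
    params = dict(a=0.5, b=1.0, c=0.2, d=0.1, xi=1.0,
                  lam1=1.0, lam2=1.0, lam3=1.0, q_w=1.0, r_w=1.0)
    sol = solve_scalar_case(**params)
    for name, value in sol.items():
        print(f"{name:12s} {value: .6f}")
```

In this scalar setting the jump size `xi` enters only `p`, `k` and the average cost, never `P` or `K`, which mirrors the statement that the additive noise affects only the feed-forward part of the control.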

Remark. The optimal control in (3.17) of the infinite time problem is a time-invariant linear feedback control plus a stationary feed-forward control. The additive noise only affects the feed-forward control. Moreover, both gains in (3.18) are quite sensitive to the coefficients $C$, $D$ of the state- and control-dependent noises, respectively. In general, large state-dependent noise can destabilize the system (1.1), while large control-dependent noise may diminish the effects of the gains $K_\infty$ and $k_\infty$ and increase the average cost in (3.21). Note that the matrices $C$ and $D$ should be small in norm to guarantee condition (I).

4. The Case of Discounted Cost.

If we use the discounted cost criterion (2.2), we can define a new state $\tilde x(t)$ and a new control $\tilde u(t)$ by

$$\tilde x(t) = e^{-\alpha t}x(t), \qquad \tilde u(t) = e^{-\alpha t}u(t). \qquad (4.1)$$

Then (2.2) becomes the limit of

$$J_T(\tilde u) = E\int_0^T [\tilde x(t)'Q\tilde x(t) + \tilde u(t)'R\tilde u(t)]\,dt \qquad (4.2)$$

as $T \to \infty$. Now the new state dynamics are

$$\begin{aligned}
d\tilde x(t) ={} & -\alpha e^{-\alpha t}x(t)\,dt + e^{-\alpha t}[Ax(t)\,dt + Bu(t)\,dt] \\
& + \{e^{-\alpha t}[x(t) + Cx(t)] - e^{-\alpha t}x(t)\}\,dN_1(t) \\
& + \{e^{-\alpha t}[x(t) + Du(t)] - e^{-\alpha t}x(t)\}\,dN_2(t) \\
& + \{e^{-\alpha t}[x(t) + \xi] - e^{-\alpha t}x(t)\}\,dN_3(t) \\
={} & (A - \alpha I)\tilde x(t)\,dt + B\tilde u(t)\,dt + C\tilde x(t)\,dN_1(t) + D\tilde u(t)\,dN_2(t) + e^{-\alpha t}\xi\,dN_3(t). \qquad (4.3)
\end{aligned}$$

From Theorem 1.1, for each $T > 0$, the optimal control is

$$\tilde u_T(t) = -\tilde K_T(t)\tilde x_T(t) - \tilde k_T(t)$$

with

$$\tilde K_T(t) = [R + \lambda_2 D'\tilde P_T(t)D]^{-1}(B + \lambda_2 D)'\tilde P_T(t)$$

$$\tilde k_T(t) = [R + \lambda_2 D'\tilde P_T(t)D]^{-1}(B + \lambda_2 D)'\tilde p_T(t).$$

The optimal value is

$$J_T(\tilde u_T) = x_0'\tilde P_T(0)x_0 + 2\tilde p_T(0)'x_0 + \tilde q_T(0)$$

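where $\tilde P_T(t) \ge 0$, $\tilde p_T(t)$ and $\tilde q_T(t)$ are the unique solutions of

$$\frac{d}{dt}P(t) + (A + \lambda_1 C - \alpha I)'P(t) + P(t)(A + \lambda_1 C - \alpha I) + \lambda_1 C'P(t)C + Q - P(t)(B + \lambda_2 D)[R + \lambda_2 D'P(t)D]^{-1}(B + \lambda_2 D)'P(t) = 0, \qquad P(T) = 0, \qquad (4.4)$$

$$\frac{d}{dt}p(t) + \{A + \lambda_1 C - \alpha I - (B + \lambda_2 D)[R + \lambda_2 D'P(t)D]^{-1}(B + \lambda_2 D)'P(t)\}'p(t) + \lambda_3 P(t)e^{-\alpha t}\xi = 0, \qquad p(T) = 0, \qquad (4.5)$$

and

$$\frac{d}{dt}q(t) + 2\lambda_3 e^{-\alpha t}\xi'p(t) + \lambda_3 e^{-2\alpha t}\xi'P(t)\xi - p(t)'(B + \lambda_2 D)[R + \lambda_2 D'P(t)D]^{-1}(B + \lambda_2 D)'p(t) = 0, \qquad q(T) = 0, \qquad (4.6)$$

respectively. In the same manner as before, if $(A + \lambda_1 C - \alpha I,\ B + \lambda_2 D)$ is stabilizable, $(Q^{1/2},\ A + \lambda_1 C - \alpha I)$ is detectable and condition (I) in section 2 holds with $A$ replaced by $A + \lambda_1 C - \alpha I$, then $\tilde P_T(t) \uparrow \tilde P_\infty$ uniformly in $[0, T-d_0]$ as $T \to \infty$ for some $d_0$, and $\tilde P_\infty \ge 0$ satisfies the algebraic equation

$$(A + \lambda_1 C - \alpha I)'P + P(A + \lambda_1 C - \alpha I) + \lambda_1 C'PC + Q - P(B + \lambda_2 D)(R + \lambda_2 D'PD)^{-1}(B + \lambda_2 D)'P = 0. \qquad (4.7)$$

In addition,

$$[A + \lambda_1 C - \alpha I - (B + \lambda_2 D)(R + \lambda_2 D'\tilde P_\infty D)^{-1}(B + \lambda_2 D)'\tilde P_\infty]$$

is a stable matrix. If $(Q^{1/2},\ A + \lambda_1 C - \alpha I)$ is observable, then $\tilde P_\infty$ is the unique solution of (4.7) in the solution class $P \ge 0$.

Let

$$\tilde R_\infty = R + \lambda_2 D'\tilde P_\infty D$$

$$\tilde K_\infty = [R + \lambda_2 D'\tilde P_\infty D]^{-1}(B + \lambda_2 D)'\tilde P_\infty.$$

Let $\Phi_T(t,s)$ be the transition matrix of

$$A + \lambda_1 C - \alpha I - (B + \lambda_2 D)[R + \lambda_2 D'\tilde P_T(t)D]^{-1}(B + \lambda_2 D)'\tilde P_T(t).$$

Then the solution $\tilde p_T(t)$ of (4.5) becomes

$$\begin{aligned}
\tilde p_T(t) &= \lambda_3\int_t^T \Phi_T(\tau,t)'\tilde P_T(\tau)e^{-\alpha\tau}\xi\,d\tau \\
&\to \lambda_3\int_t^\infty \exp\{[A + \lambda_1 C - \alpha I - (B + \lambda_2 D)\tilde K_\infty]'(\tau - t)\}\tilde P_\infty e^{-\alpha\tau}\xi\,d\tau \\
&= -e^{-\alpha t}\lambda_3\{[A + \lambda_1 C - 2\alpha I - (B + \lambda_2 D)\tilde K_\infty]'\}^{-1}\tilde P_\infty\xi \\
&=: e^{-\alpha t}\tilde p_\infty \qquad (4.8)
\end{aligned}$$

where the convergence is uniform in $[0, T-d_0]$ as $T \to \infty$ for some $d_0 > 0$.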
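The monotone convergence $\tilde P_T(t) \uparrow \tilde P_\infty$ claimed for (4.4) and (4.7) can be checked numerically in the scalar case. The sketch below integrates the scalar form of (4.4) in reversed time $s = T - t$ from the terminal condition $P(T) = 0$ with a plain forward Euler step; the integrator, step size, function names and parameter values (including the discount rate `alpha = 0.1`) are all assumptions made for illustration.

```python
def riccati_rhs(P, a=0.5, b=1.0, c=0.2, d=0.1, lam1=1.0, lam2=1.0,
                q_w=1.0, r_w=1.0, alpha=0.1):
    """Scalar algebraic part of (4.4); setting it to zero gives (4.7).

    In reversed time s = T - t, equation (4.4) reads dP/ds = riccati_rhs(P).
    """
    a_t = a + lam1 * c - alpha          # A + lambda_1 C - alpha I
    b_t = b + lam2 * d                  # B + lambda_2 D
    return ((2.0 * a_t + lam1 * c * c) * P + q_w
            - (b_t * P) ** 2 / (r_w + lam2 * d * d * P))


def finite_horizon_P0(horizon, step=1e-3):
    """Return an Euler approximation of P_T(0) for T = horizon."""
    n_steps = round(horizon / step)
    P = 0.0                              # terminal condition P(T) = 0
    for _ in range(n_steps):
        P += step * riccati_rhs(P)
    return P


if __name__ == "__main__":
    for T in (1, 2, 5, 10, 20):
        P0 = finite_horizon_P0(T)
        print(f"T={T:3d}  P_T(0)={P0:.8f}  residual of (4.7)={riccati_rhs(P0):.2e}")
```

For the values assumed here, $P_T(0)$ increases with $T$ and the residual of (4.7) shrinks toward zero, which is the scalar counterpart of the uniform convergence on $[0, T-d_0]$.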
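On the other hand, we want to show that

$$\tilde q_T(t) = \int_t^T \{2\lambda_3 e^{-\alpha\tau}\xi'\tilde p_T(\tau) + \lambda_3 e^{-2\alpha\tau}\xi'\tilde P_T(\tau)\xi - \tilde p_T(\tau)'(B + \lambda_2 D)[R + \lambda_2 D'\tilde P_T(\tau)D]^{-1}(B + \lambda_2 D)'\tilde p_T(\tau)\}\,d\tau$$

converges to

$$\begin{aligned}
& \int_t^\infty \{2\lambda_3 e^{-2\alpha\tau}\xi'\tilde p_\infty + \lambda_3 e^{-2\alpha\tau}\xi'\tilde P_\infty\xi - e^{-2\alpha\tau}\tilde p_\infty'(B + \lambda_2 D)[R + \lambda_2 D'\tilde P_\infty D]^{-1}(B + \lambda_2 D)'\tilde p_\infty\}\,d\tau \\
&\quad = \frac{1}{2\alpha}e^{-2\alpha t}[2\lambda_3\xi'\tilde p_\infty + \lambda_3\xi'\tilde P_\infty\xi - \tilde p_\infty'(B + \lambda_2 D)(R + \lambda_2 D'\tilde P_\infty D)^{-1}(B + \lambda_2 D)'\tilde p_\infty] \\
&\quad =: e^{-2\alpha t}\tilde q_\infty. \qquad (4.9)
\end{aligned}$$

Thus, writing $\tilde R_T(\tau) = R + \lambda_2 D'\tilde P_T(\tau)D$,

$$\begin{aligned}
|\tilde q_T(t) - e^{-2\alpha t}\tilde q_\infty| \le{} & \Big|\int_t^T \{2\lambda_3 e^{-\alpha\tau}\xi'[\tilde p_T(\tau) - e^{-\alpha\tau}\tilde p_\infty] + \lambda_3 e^{-2\alpha\tau}\xi'[\tilde P_T(\tau) - \tilde P_\infty]\xi \\
& - [\tilde p_T(\tau) - e^{-\alpha\tau}\tilde p_\infty]'(B + \lambda_2 D)\tilde R_T(\tau)^{-1}(B + \lambda_2 D)'\tilde p_T(\tau) \\
& - e^{-\alpha\tau}\tilde p_\infty'(B + \lambda_2 D)\tilde R_T(\tau)^{-1}[\tilde R_\infty - \tilde R_T(\tau)]\tilde R_\infty^{-1}(B + \lambda_2 D)'\tilde p_T(\tau) \\
& - e^{-\alpha\tau}\tilde p_\infty'(B + \lambda_2 D)\tilde R_\infty^{-1}(B + \lambda_2 D)'[\tilde p_T(\tau) - e^{-\alpha\tau}\tilde p_\infty]\}\,d\tau\Big| + e^{-2\alpha T}|\tilde q_\infty|. \qquad (4.10)
\end{aligned}$$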

The last term in (4.10) tends to 0 as $T \to \infty$, independently of $t$. The integration in (4.10) over $[t, T]$ can be divided into two parts: the first integration, over $[T-d_0, T]$, can be made less than $\epsilon$ for each $d_0$ provided $T$ is sufficiently large, while the second integration, over $[t, T-d_0]$, is less than $M_1\epsilon$ for some constant $M_1$ which is independent of $t$, since $\|\tilde P_T(\tau) - \tilde P_\infty\| < \epsilon$, $\|\tilde R_T(\tau) - \tilde R_\infty\| < \epsilon$, and $|\tilde p_T(\tau) - e^{-\alpha\tau}\tilde p_\infty| < \epsilon$ for all $\tau \in [0, T-d_0(\epsilon)]$ with some $d_0(\epsilon)$. Thus, (4.10) tends to 0 uniformly on $[0, T)$ as $T \to \infty$. Hence,

$$\begin{aligned}
\lim_{T\to\infty}J_T(\tilde u_T) &= \lim_{T\to\infty}[x_0'\tilde P_T(0)x_0 + 2\tilde p_T(0)'x_0 + \tilde q_T(0)] \\
&= x_0'\tilde P_\infty x_0 + 2\tilde p_\infty'x_0 + \tilde q_\infty \\
&=: J^*. \qquad (4.11)
\end{aligned}$$

If there exists $\tilde u$ such that $J_d(\tilde u) < J^*$, then for large $T$ we have $J_T(\tilde u) < J_T(\tilde u_T)$, which contradicts $J_T(\tilde u_T)$ being the optimal value for the problem restricted to $[0, T]$, so $J^*$ must be the optimal value.

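As before, we can also prove $\tilde x_T(t) \to \tilde x_\infty(t)$ as $T \to \infty$ and

$$\tilde u_T(t) = -\tilde K_T(t)\tilde x_T(t) - \tilde k_T(t) \to -\tilde K_\infty\tilde x_\infty(t) - e^{-\alpha t}\tilde k_\infty = \tilde u_\infty(t) \quad \text{as } T \to \infty$$

where

$$\tilde K_\infty = [R + \lambda_2 D'\tilde P_\infty D]^{-1}(B + \lambda_2 D)'\tilde P_\infty$$

$$\tilde k_\infty = [R + \lambda_2 D'\tilde P_\infty D]^{-1}(B + \lambda_2 D)'\tilde p_\infty. \qquad (4.12)$$

The convergence is uniform on $[0, T-d_0]$ as $T \to \infty$ for some $d_0 > 0$. Thus the convergence of $x_T(t) \to x_\infty(t)$ and that of

$$u_T(t) = -\tilde K_T(t)x_T(t) - e^{\alpha t}\tilde k_T(t) \to -\tilde K_\infty x_\infty(t) - \tilde k_\infty = u_\infty(t) \quad \text{as } T \to \infty \qquad (4.13)$$

are uniform on each bounded interval. Thus,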

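$$\begin{aligned}
|J_T(u_T) - J_T(u_\infty)| &= \Big|E\int_0^T e^{-2\alpha t}\{[x_T(t)'Qx_T(t) - x_\infty(t)'Qx_\infty(t)] + [u_T(t)'Ru_T(t) - u_\infty(t)'Ru_\infty(t)]\}\,dt\Big| \\
&\le E\int_0^T e^{-2\alpha t}\,\big|[x_T(t) - x_\infty(t)]'Qx_T(t) + x_\infty(t)'Q[x_T(t) - x_\infty(t)] \\
&\qquad\qquad + [u_T(t) - u_\infty(t)]'Ru_T(t) + u_\infty(t)'R[u_T(t) - u_\infty(t)]\big|\,dt. \qquad (4.14)
\end{aligned}$$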

Since the integrand of (4.14) is uniformly bounded in $0 \le t \le T$, for each $\epsilon > 0$ we can find a sufficiently large $d > 0$ such that the integrand of (4.14) integrated over $[d, T]$ can be made less than $\epsilon M_2(\omega)$ for all $T > 0$. For this bounded interval $[0, d]$, there exists $T_0$ such that

$$|x_T(t) - x_\infty(t)| < \epsilon \quad \text{and} \quad |u_T(t) - u_\infty(t)| < \epsilon \quad \text{a.s.}$$

for all $t \in [0, d]$ and $T > T_0$, so that the integrand of (4.14) integrated over $[0, d]$ is less than $\epsilon M_3(\omega)$ for some $M_3(\omega) > 0$. Hence (4.14) can be made small as $T \to \infty$, so that

$$J_d(u_\infty) = \lim_{T\to\infty}J_T(u_\infty) = \lim_{T\to\infty}J_T(u_T)$$

which shows that $u_\infty$ in (4.13) is the optimal control.

We summarize these results as follows.

Theorem 4.1. In the discounted cost case, if $(A + \lambda_1 C - \alpha I,\ B + \lambda_2 D)$ is stabilizable, $(Q^{1/2},\ A + \lambda_1 C - \alpha I)$ is observable and condition (I) in section 2 holds with $(A + \lambda_1 C - \alpha I)$ instead of $A$, then the optimal control exists and is of the form

$$u_\infty(t) = -\tilde K_\infty x_\infty(t) - \tilde k_\infty \qquad (4.15)$$

where $\tilde K_\infty$ and $\tilde k_\infty$ are defined in (4.12). The optimal discounted cost is

$$J_d(u_\infty) = x_0'\tilde P_\infty x_0 + 2\tilde p_\infty'x_0 + \tilde q_\infty \qquad (4.16)$$

where $\tilde P_\infty$, $\tilde p_\infty$ and $\tilde q_\infty$ are defined in (4.7), (4.8) and (4.9) respectively.

Remark. The average cost criterion measures the long-run performance on the average. It neglects the behavior of the system over any finite interval, while the discounted cost criterion emphasizes the initial performance, in particular the initial condition $x_0$ as in (4.16). However, the optimal control involves a time-invariant linear feedback control and a stationary feed-forward control in both situations.